§1.1 Introduction
The creation of a compiler for a specific language and target machine is an
arduous process. It is not uncommon to invest several years in the production of
an acceptable compiler; the excellent compilers available for PL/I on MULTICS and
System 370 evolved over a decade or more. With the rapid development of new
computing hardware and the proliferation of high-level languages, such an
investment is no longer practical, especially if there is little carryover from one
compiler to the next.

Compiler writers currently suffer from the same malady as the shoemaker's
children: they seem to be the last to benefit from the improvements in
language technology [...] the same high-level [...] compiler production, systems
have been developed to automatically generate those portions of the compiler
which translate the source language program into an internal form suitable for
code generation. These systems have enhanced portability and extensibility of
the resultant compiler without a significant degradation in its performance. The
final phases of a compiler, those concerned with code generation, [...]. Many
different approaches are possible (see §1.4); this thesis addresses the task of
providing a specification of a code generator. Such a specification is constructed
by the code generator designer within a framework provided by an intermediate
language (IL) and a metainterpreter.

The intermediate language is used as the internal representation of the code
generator: the initial input (provided by the first phase of the compiler) is a
source program expressed as an IL program; the final output is the IL
representation of the target machine program. The metainterpreter has a detailed
understanding of the semantics of IL programs and is capable of performing many
transformations and optimizations on those programs. The semantics of IL are
limited to concepts common to many languages and machines: flow of control and
the management of names and values are the only primitive concepts. Other
semantics (e.g., the semantics of individual operators) are provided by the
designer in the form of a transformation catalogue. In essence, the semantics of
IL serve as common ground on which the designer (through the transformation
catalogue) "explains" the source language and target machine to the
metainterpreter, which then performs the appropriate translation. This
"explanation" is in terms of step-by-step syntactic manipulation of the IL
program: each transformation accumulates additional information for the
metainterpreter or provides possible translations for statements which are not
yet target machine instructions. Since the metainterpreter incorporates many of
the optimizations commonly performed by compilers, the specification need not
supply detailed implementation descriptions of these operations.

One can envision several distinct uses for such a specification:

   o [...]

   o [...] interpreted to produce an acceptable translation (e.g., [...])

   o as an input to a system which automatically constructs a code
     generator [...]

Each successive use requires a more thorough understanding of the specification
but repays this investment with a corresponding increase in the degree of
automation achieved. The increase is based for the most part on a better
understanding of the interaction between components of the specification.
Automatic creation of the code generator from a specification would require
extensive analysis of these interactions, a capability only now emerging from
artificial intelligence research on program synthesis [Barstow]. Fortunately most of
the analytical mechanism required is in addition to the facilities provided by the
metainterpreter and intermediate language; it is reasonable to expect that future
research will be able to extend the framework described in the preceding
paragraph to allow automatic production of a code generator. This thesis
concentrates on developing the framework to the point where it can be used
interpretively (as suggested by the second use). Implemented in a straightforward
fashion, the metainterpreter can perform the translation by alternately applying
transformations from the catalogue and optimizing the updated IL program. While
this approach is admittedly less efficient than current code generation
techniques, it represents a significant step towards separating machine and
language dependencies in a declarative form (the transformation catalogue) from
general knowledge about code generation (embodied in the metainterpreter).
The following section provides a brief overview of the tasks confronting a
code generator. §1.3 presents a summary of the salient features of IL, the
transformation catalogue, and the metainterpreter. In §1.4, related work is
discussed with an eye towards providing a genealogy for the research reported
here. Finally, §1.5 outlines the organization of the remainder of the thesis.

§1.2 Setting the stage
Before embarking on a discussion of the proposed formalism, let us first [...]

The idea, of course, is that by executing the resulting sequence of machine
instructions the target machine will carry out the specified computation. The
remainder of this section outlines the tasks confronting a code generator; our
objective is to sketch the variety of knowledge needed for making decisions during
code generation. An optimizing code generator is organized around three main tasks:

Machine-independent optimizations include global flow analysis, constant
propagation, [...]; these transformations modify the semantic tree, producing a
new tree which is strictly equivalent (i.e., equivalent regardless of the choice
of target machine). Certain of these transformations do make general assumptions
about the target machine architecture; for instance, constant propagation assumes
that it is more efficient to access a constant than a variable. The more
sophisticated code generators [Wulf] do not actually modify the semantic tree:
they maintain a list of alternatives for each node in the tree†, postponing the
choice of transformation until the translation phase.

† They do not, however, list all possible alternatives as this would result in the
combinatorial growth of the semantic tree. Searching the full tree for the optimal
[...]

The translation phase [...] in several steps:

    [...] of redundant computations. It is often possible [...]

    (iv) Actual target machine instructions are generated. Machine-dependent
         considerations (such as locations of operands for [...]) [...]
         which implement the required computations [...].

From the many possible transformations applicable to a particular source program,
an optimizing code generator chooses some subset to produce the "best"
translation. These transformations are interdependent and an a priori
determination of their combined effect is difficult.
Machine-dependent (peephole) optimization [McKeeman; Wulf: Chapter 6] of
instruction sequences can be used to improve the generated code; just how much
improvement can be made depends on the sophistication of the translation phase.
The goal is to substitute more efficient instruction sequences for small portions
of the code. Examples: elimination of jumps to other jumps and code following
unconditional jumps, use of short-address jumps (limited in how far they can
jump), elimination of redundant store-load sequences, etc. This phase is iterated
until no more improvements can be made. Before the reader dismisses this final
phase as "trivial," he should consider this comment from [Wulf]:

    [...] the fancy optimization is [...] not nearly as important as
    careful, thorough exploitation of the target machine. It is
    difficult to [...] to what extent [this final phase] would be
    needed if more complete algorithms existed in [...] the compiler.
    However, since some of the operations of [this final phase] exist
    simply because the requisite [...] be a role for a [...].
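
The peephole rewrites named above (jumps to jumps, code following an
unconditional jump, redundant store-load pairs) can be illustrated with a small
sketch. Everything in it is an assumption made for illustration: the tuple
encoding of instructions and the opcode names LABEL, JMP, STORE and LOAD are
hypothetical and are not taken from any code generator discussed in this thesis.

```python
def jump_chains(code):
    """Map each label to the target of an unconditional jump that directly follows it."""
    targets = {}
    for i in range(len(code) - 1):
        if code[i][0] == "LABEL" and code[i + 1][0] == "JMP":
            targets[code[i][1]] = code[i + 1][1]
    return targets


def resolve(label, targets):
    """Follow a chain of jumps to its end; leave the label alone if the chain is a cycle."""
    start, seen = label, {label}
    while label in targets:
        label = targets[label]
        if label in seen:
            return start
        seen.add(label)
    return label


def peephole_pass(code):
    """Apply each local rewrite once, scanning the code from top to bottom."""
    targets = jump_chains(code)
    out = []
    skipping = False  # True while we are in code that follows an unconditional jump
    for ins in code:
        if ins[0] == "LABEL":
            skipping = False  # a label may be a jump target, so code is reachable again
        if skipping:
            continue  # drop code following an unconditional jump
        if ins[0] == "JMP":
            ins = ("JMP", resolve(ins[1], targets))  # jump to a jump: go straight to the end
            skipping = True
        if ins[0] == "LOAD" and out and out[-1] == ("STORE", ins[1], ins[2]):
            continue  # redundant load: the register still holds the stored value
        out.append(ins)
    return out


def peephole(code):
    """Iterate the pass until no more improvements can be made."""
    while True:
        improved = peephole_pass(code)
        if improved == code:
            return improved
        code = improved
```

Note that each rewrite inspects only a few adjacent instructions, which is why
the phase must be repeated: removing one instruction can expose a new
opportunity next to it.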


It should be noted that relatively few of the transformations mentioned
above are uniformly applicable. Unfortunately, the sequential control structures
upon which extant code generators are based preclude a trial-and-error approach
to optimization. The programmer, using his knowledge of the target machine
architecture, must, out of necessity, incorporate in the code generator either some
subset of the applicable transformations or heuristics to select the "best"
transformation at specific points in the code generation process. These heuristics
base their decisions on a local examination of the tree; more far-reaching
consequences are difficult to determine. Thus, most heuristics "work" for only a
subset (albeit large) of the possible programs. Although the compromises inherent
in heuristics serve primarily to reduce the amount of computation needed to
complete the translation, they also embody knowledge helpful in the generation of
code. Some of these transformations are of general use in that they are
independent of both the intermediate representation and the target machine; these
transformations form a nucleus of knowledge for the portable code generation
system.


§1.3 Introduction to IL/ML
The framework for the specification of code generators provided by the
IL/ML system has three basic components:

   o an intermediate language (IL) which serves as the internal
     representation of the code generator

   o a transformation catalogue whose component transformations are
     expressed in a context-sensitive pattern-matching metalanguage (ML)
     as pattern/replacement pairs. The pattern specifies the context of
     the transformation as an IL program fragment; the replacement is
     another fragment to be substituted for the matched fragment.

   o a metainterpreter which searches the catalogue, applies
     transformations, and optimizes the resulting IL program

Within this framework, code generation may be viewed as follows†: the
transformation catalogue is searched by the metainterpreter until a pattern is found
that matches some fragment of the current IL program; then the corresponding
replacement is substituted for the matched fragment, creating an updated version
of the IL program. Next, the metainterpreter optimizes the new version of the
program utilizing new information and opportunities presented by the transformation.
This cycle is repeated until no further matches can be found, at which point the
translation is completed. The simplicity of the mechanism, along with the modularity
of the transformation data base, make this an attractive basis for a code generator
specification.
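
The match, replace, optimize cycle described above can be sketched in a few
lines. This is only a conceptual illustration under stated assumptions: IL
statements are modeled as Python tuples, each catalogue entry is a function that
looks at a single statement (real ML patterns are context-sensitive), and
`fold_constants` is a hypothetical stand-in for the optimizations built into the
metainterpreter. None of these names come from the IL/ML system itself.

```python
def fold_constants(program):
    """Stand-in for the metainterpreter's built-in optimizations:
    replace an add of two integer literals by a constant."""
    out = []
    for st in program:
        if st[0] == "add" and isinstance(st[2], int) and isinstance(st[3], int):
            out.append(("const", st[1], st[2] + st[3]))
        else:
            out.append(st)
    return out


def expand_const(st):
    """Catalogue entry: a constant becomes a (hypothetical) move-immediate."""
    if st[0] == "const":
        return [("MOVI", st[1], st[2])]
    return None


def expand_add(st):
    """Catalogue entry: a three-address add becomes a two-address MOV/ADD pair."""
    if st[0] == "add":
        _, dst, a, b = st
        return [("MOV", dst, a), ("ADD", dst, b)]
    return None


def translate(program, catalogue):
    """Alternately apply one transformation and re-optimize, until nothing matches."""
    program = fold_constants(program)
    while True:
        for i, st in enumerate(program):
            for rule in catalogue:
                replacement = rule(st)
                if replacement is not None:
                    # Substitute the replacement for the matched fragment ...
                    program = program[:i] + replacement + program[i + 1:]
                    # ... then let the optimizer exploit the updated program.
                    program = fold_constants(program)
                    break
            else:
                continue  # no rule matched this statement; try the next one
            break  # a rule was applied; restart the search
        else:
            return program  # no further matches: the translation is complete
```

For instance, `translate([("add", "t1", 2, 3), ("add", "t2", "t1", "x")],
[expand_const, expand_add])` first folds the constant add and then expands both
statements. The loop terminates here only because no rule matches its own
output; a catalogue without that property could cycle forever.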

Only concepts common to most machine and source language programs have
been incorporated into IL and the metainterpreter; concepts specific to a machine
or language are introduced by the designer through the transformation catalogue.
Many of these new concepts need never be related to the primitives of IL: they
can be introduced into the IL program as attributes of some component of the IL
program where they can be referenced by transformations. The semantics of
these attributes are established by the role they play in various transformations;
for example, [...] be related to the integer and floating-point addition
instructions of the target machine [...] the metainterpreter [...] source language
semantics [...] semantics allows great flexibility without any [...] intermediate
language or metainterpreter [...].

But isn't it [...] to require the designer [...]? Isn't the objection to
conventional code generators that a large investment is necessary to redo the
translations when another target machine or source language is to be
accommodated? Well, not really. There is no "magic" provided by the IL/ML
system: the semantics of the source language and target machine must [...] terms
of one another [...] any other language/interpreter. Moreover, since the [...]
necessary knowledge about general optimization [...] description is small
compared to coding a conventional code generator. It is true that more [...]
intermediate language [...] abstract machines [...] general purpose code
generation system, such constraints have been avoided.

† This description is only a conceptual model; in a code generator constructed
[...] decisions [...] in choosing and applying a [...] have been ordered [...].

§1.3.1 A syntactic model of code generation

One of the most useful discoveries of artificial intelligence research is that
complicated semantic manipulations can be accomplished with step-by-step
syntactic manipulation of an appropriately chosen data base (see, for example,
[Hewitt]). This section explores the application of this approach to the process of
code generation. The objective of this exploration is to provide a different
perspective of the IL/ML system; hopefully this will lead to a better designed
transformation catalogue.

One can characterize code generation as a consecutive sequence of
transformations chosen from the transformation catalogue and applied to an
intermediate language input string:

    $S_{\text{intermediate}} \Rightarrow s_1 \Rightarrow s_2 \Rightarrow \cdots \Rightarrow S_{\text{target machine}}$

$S_{\text{target machine}}$ is not guaranteed unique; a code generation algorithm
may have to choose among many translations. If the translation uses an abstract
machine then we will have

    $S_{\text{intermediate}} \Rightarrow s_1 \Rightarrow \cdots \Rightarrow S_{\text{AM}} \Rightarrow \cdots \Rightarrow S_{\text{target machine}}$

The transformations leading to $S_{\text{AM}}$ are independent of the target
machine; the transformations following $S_{\text{AM}}$ are machine dependent. If
we group transformations according to the code generation steps they describe
(e.g., storage allocation, register assignment, etc.), each group describes the
translation of programs for a particular abstract machine into programs for
another. By defining a hierarchy of abstract machines, the designer can limit the
effect of a particular feature of the target machine to a few transformations.
This type of organization of the transformation catalogue leads to a fairly
modular specification.
As was mentioned above, the resulting machine language program is not
always unique; in order to be able to decide among competing translations, it is
necessary to define a measure on translations.

This totally ordered measure is to reflect the optimality of the translation; the
smaller the measure, the more optimal the translation. Note that the measure is not
necessarily defined for anything but a completed translation. Typically the
measure is computed from the values of attributes of the statements in the final
program: it is up to the designer to ensure that each statement is assigned these
attributes. If some statement does not have the appropriate attributes defined,
the measure of that IL program will be undefined. The final choice for a given
input string s and measure m is the set of "optimal" translations given by

    $\{\, s' \mid s \Rightarrow^{*} s' \ \text{and for all}\ s''\ [\, s \Rightarrow^{*} s'' \ \text{implies}\ m(s') \le m(s'') \,] \,\}$

Note that we restrict our notion of optimality to those strings which can be actually
derived from the initial program (s) by repeated applications of transformations from
the transformation catalogue (i.e., $s \Rightarrow^{*} s'$). It is possible that
semantically equivalent strings exist which are more optimal but which may not be
discovered because of some inadequacy in the transformation catalogue. In some
sense this inadequacy is intrinsic since the semantic equivalence problem is in
general unsolvable [Aho70].
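
The set of "optimal" translations defined above can be computed by brute force
when the set of derivable strings is finite. The sketch below is an illustration
under stated assumptions, not a proposed implementation: strings are tuples of
tokens, a transformation is a hypothetical (left-hand side, right-hand side)
pair with no context, and the measure is the sum of a per-token cost attribute.
Strings containing a token without a cost have an undefined measure and are
excluded from the comparison, following the convention that only completed
translations are measured.

```python
from collections import deque


def successors(s, rules):
    """All strings obtainable from s by one application of one transformation."""
    for lhs, rhs in rules:
        n = len(lhs)
        for i in range(len(s) - n + 1):
            if s[i:i + n] == lhs:
                yield s[:i] + rhs + s[i + n:]


def derivable(s, rules):
    """Every string reachable from s by zero or more transformations.
    Assumes this set is finite; otherwise the search does not terminate."""
    seen = {s}
    queue = deque([s])
    while queue:
        current = queue.popleft()
        for nxt in successors(current, rules):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def measure(s, cost):
    """Sum of token costs, or None when some token lacks the cost attribute."""
    if all(tok in cost for tok in s):
        return sum(cost[tok] for tok in s)
    return None


def optimal(s, rules, cost):
    """The derivable strings whose (defined) measure is smallest."""
    scored = []
    for t in derivable(s, rules):
        m = measure(t, cost)
        if m is not None:
            scored.append((m, t))
    if not scored:
        return set()
    best = min(m for m, _ in scored)
    return {t for m, t in scored if m == best}
```

With the rules `("x2",) -> ("MUL",)`, `("x2",) -> ("SHL",)` and
`("x2",) -> ("ADD",)` and costs 3, 1 and 1, both the shift and the add are
returned, which shows that the optimal translation need not be unique.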

In our syntactic view of code generation, we have set forth two tasks for
the code generator. First, it must produce a set of translations for a given input
string that meet certain basic criteria: e.g., they must be well-formed
machine-language programs (only these should have the correct attributes needed to
compute the measure). Second, it must select one of these translations as the
translation. This selection is based on the optimality of the translation as well as
other constraints the user may supply at compile time (e.g., upper bounds on space
[...]). [...] transformations which result [...] automatic [...] prompt
us to change our minds.

Let us take a moment to outline the advantages and disadvantages of this
approach to code generation. By describing code generation as a series of simple
syntactic transformations we have removed the onus of specifying the order in
which they are to be applied: we have removed the control structure of the code
generator. In its place we require that the designer specify enough context for
each transformation to guarantee it will be applied only when appropriate. The
merits of this tradeoff are difficult to judge. For small sets of transformations
it is simpler to omit the control structure [...] implicit control [...] for the
transformations on a given level. [...] for enforcing this modularity; several
are present [...].

[...] advantage of a syntactic view of code generation is that the designer is not
encumbered with the details of [...] more natural level in describing code
generation. The [...] current lack of a simple technique for creating a code
generator from the transformations. To [...] implement a code generator, we will
have to make [...] supplied by the context of each transformation. Until this
problem is solved, it remains the largest barrier to [...].


§1.3.2 The transformation catalogue and metainterpreter
Since the emphasis in a specification is on describing what the code
generator is to do rather than how it is to be done, an effort has been made to
distinguish strategy from mechanism. The strategic decisions made by a code
generator [...] can be [...]:

   (1) expansion of a [...]

   (2) simplification or elimination of IL statements whose operands [...]

   (3) [...] IL statements [...]

The applicability of a transformation to a particular IL statement depends on the
context in which that statement appears. In conventional code generators the
context of an operation is established by two interdependent computations:

   o flow analysis to determine available expressions, use-definition
     chains, and live variables

   o compile-time calculation of values for variables and intermediate
     results

In an IL/ML specification, these computations have been incorporated as part of the
context matching performed before a transformation is applied. The designer
never explicitly invokes the underlying mechanism; instead he may deal directly
with values of variables, execution order of IL statements, etc. as part of an ML
pattern.


The adequacy of ML as the basis [...] context. The pattern primitives provided
by ML are based on standard data flow [...] would be so large as to [...]
optimal results); for many of the [...] in question no such [...].

Neither alternative is completely satisfactory and further research is needed to
reach a conclusion. It seems reasonable to expect an eventual resolution of this
issue and there is some evidence [Harrison] that many such optimizations may be
ignored without significantly degrading the usability of the specification. In this
spirit, the remainder of the thesis concentrates on the specification of code
generation techniques which have a basis in flow analysis and its extensions.


§1.4 Relation to previous work

[...] of code generators: the development of high-level languages better adapted
to the writing of code generators and the introduction of an "abstract machine"
to further [...]. [...] provide as primitives many of the elementary operations
used in code generation such as storage and register allocation and automatic
management of internal data bases (e.g., the symbol table). The actual process of
code generation typically fills in a user-provided code template with sundry
parameters such as the actual [...] of the operands, etc. Local optimization is
accomplished by special constructs within the template which allow testing for
given attributes of the parameters. Modularity of the code generator is improved
and much of the machine-dependent information is in descriptive form. Oftentimes,
the portions of the code generation algorithm and the optimization mechanism which
depend on the semantics of the source language or target machine must still be
coded into procedural form. The encoding of this information (usually as special
cases) [...]

The apparent dichotomy between descriptions of the intermediate language
and the target machine led to disparate [...] for describing each. The use
of an abstract machine (AM) capitalizes on this dichotomy. The operations of the
[...]. A code [...] translated into a sequence of AM operations and then each AM
operation is, in turn, expanded into a sequence of target instructions. The
optimality of the resultant code is largely a function of how closely the AM and
the target machine correspond and how much work is expended on the expansion.
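
The two-step scheme described above (source program to AM operations, then each
AM operation to target instructions) can be sketched as follows. The stack-style
abstract machine, its operation names, and the register-machine mnemonics are
all hypothetical choices made for this illustration; they do not describe any
particular AM from the literature cited here.

```python
# Hypothetical source-operator to AM-operation mapping.
AM_OPERATORS = {"+": "ADD", "*": "MUL"}


def to_am(stmt):
    """Step 1: translate ("assign", dest, expr) into stack-machine AM operations.
    An expression is a variable name or a tuple (operator, left, right)."""
    kind, dest, expr = stmt
    if kind != "assign":
        raise ValueError("only assignments are handled in this sketch")
    ops = []

    def walk(e):
        if isinstance(e, str):
            ops.append(("PUSH", e))
        else:
            op, left, right = e
            walk(left)
            walk(right)
            ops.append((AM_OPERATORS[op],))

    walk(expr)
    ops.append(("POP", dest))
    return ops


# Step 2: one fixed target-code template per AM operation (hypothetical mnemonics).
EXPANSIONS = {
    "PUSH": lambda v: [f"LOAD R0, {v}", "PUSHR R0"],
    "POP": lambda v: ["POPR R0", f"STORE R0, {v}"],
    "ADD": lambda: ["POPR R1", "POPR R0", "ADD R0, R1", "PUSHR R0"],
    "MUL": lambda: ["POPR R1", "POPR R0", "MUL R0, R1", "PUSHR R0"],
}


def expand(am_ops):
    """Expand each AM operation independently into target instructions."""
    target = []
    for op in am_ops:
        target.extend(EXPANSIONS[op[0]](*op[1:]))
    return target
```

Expanding `z := x + y` this way yields ten target instructions, including a
`PUSHR R0` immediately followed by `POPR R0`. Because each AM operation is
expanded in isolation, such redundancies appear wherever the AM and the target
machine fail to correspond closely, which is the point made in the paragraph
above.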

UNCOL was introduced to solve the "m×n translator" problem. Its proponents
hoped that the use of a common base language would reduce the number of modules
needed to translate m languages to n machines from m×n to m+n; they would
translate a program in one of the m languages to UNCOL and then translate the
UNCOL program to one of the n machines. The "UN" in "UNCOL" was their undoing as
it proved exceedingly difficult to incorporate all the features of existing
languages and machines into the primitives of a single language. By limiting the
scope of the AM to a class of languages and machines [Coleman, Waite], it was
possible to achieve truly portable software with a minimum of effort. Current
implementations fall into two categories:

   o [...] may be easily modified to accommodate a different machine;
     however, due to the loss of [...] AM operations, it is difficult to
     [...]

   o [...] to produce optimized code for a specific target machine
     [Richards]. This end is achieved via a "simulation" of the AM
     operations to gather sufficient [...]

[...] optimization [...] implement.

Thus, the designer had to choose between [...] independence at the cost of
[...]. In an [...] by a single AM, some researchers [...] have used a more general
machine-description facility such as that provided by ISP [...]. An ISP description
provides a low-level (i.e., register transfer), fairly detailed description of the
target machine which is amenable to mechanical interpretation to simulate the
described processor. If an ISP description of source language operations is also
available, a sophisticated code generator would have sufficient information to
complete a translation. Despite the success of ISP in [...] [Barbacci], it is not
really suitable for describing the semantics of a high-level language: [...]
many of the operators and data types of the language. [...]

[...] context in which the nonterminal symbol appears [...] those properties of
the nonterminal symbol which derive from [...].

The relationship between the attributes of one symbol and another is specified by
"semantic rules" associated with each production defining the synthesized [...]
the inherited attributes for the [...] righthand side of the production. [...]
presents several production systems augmented with attributes that describe
information commonly collected in the course of optimization (block numbers,
whether a statement can be reached during execution, etc.). The principal
advantage of such production systems is that there are no dependencies in the
formalism on specific language or machine semantics: attribute grammars provide a
general mechanism for accumulating contextual information during the first phase
of compilation. However, optimizations that require other than a local
examination of context are hard to accommodate: constructing the appropriate
attributes can be nontrivial (cf. lambda calculus example in [Knuth]). Finally,
except in trivial cases, translation into a target machine program (with the
attendant optimizations) [...] is highly machine dependent.

Attributes have been adopted by Newcomer in his work on generalizing the
optimization strategies employed by the BLISS/11 compiler [Newcomer]. In
performing the expansion into PDP-11 code, this compiler depends heavily on tables
which contain hand-compiled information on the best code for each expansion.
Newcomer attempts to automate the production of these tables by examining a
description of the target machine. He uses a [...] search [...]; in this search
(guided by a preferred attribute set [...] the machine description) he collects
the information needed to construct the tables. The [...]. Although the [...].

[...] generation scheme incorporating a fairly complete [...] General Purpose
Optimizing (GPO) compiler developed at IBM [Harrison]. The structure of the GPO
compiler is similar to that [...] this thesis: there is an intermediate language
schema used as the internal representation, a set of defining procedures that
serve as the basis for [...] pseudomachine language, and a program which modifies
the internal representation as optimizations and expansions are applied. The
expansions and optimizations are [...] program into machine language, performing
register assignments, etc. The GPO compiler is oriented towards PL/I-like
programs: the primitives provided in the intermediate language directly support
block structure, PL/I pointer semantics, etc. The set of defining procedures
allows tailoring of code dependent on attributes of the operands. The main
differences between the GPO compiler and IL/ML are

   o [...] on the part of the GPO compiler.

   o the syntax of defining procedures of the GPO compiler are best [...]

   o there is no notion of [...] adjacent [...] into a single operation (as in
     peephole optimization). Although Harrison talks of [...] basis (i.e.,
     there is no general pattern [...] the attribute information available
     throughout the program. IL/ML [...] certain situations where the
     optimizations would not be able to [...]

The complexity of the GPO compiler is greatly reduced from that of current
optimizing compilers. [Carter] has hand-simulated the expansion of test cases
using a set of simple defining procedures for the substring operator of PL/I,
producing code which equals or betters that of the IBM optimizing compiler (which
includes some 8000 statements to treat special cases of substring). The inclusion
of more sophisticated optimizations in the processor (cf. [Schatz]) should further
improve these statistics. Encouragingly, many of these results seem applicable to
the formalism proposed in this thesis; the increased generality of IL/ML should not
reduce its performance in this area.


§1.5 Outline of remaining chapters

Chapter 2 is a detailed description of the intermediate language IL: the
syntax of IL is defined and the representation of data is discussed. The semantics
of each IL construct is described and related to the needs of ML and the
metainterpreter. The chapter concludes with a brief introduction to the
compile-time calculation of values.

Chapter 3 discusses the construction of a transformation from ML templates
that specify its context and effect. The syntax of a template (description of an IL
program fragment) is described, emphasizing the utility of wild cards and built-in
functions. Rules for applying the transformation and updating the IL program are
given. The final section describes a few sample transformations.

Chapter 4 presents a set of sample transformations and simulates their
application by the metainterpreter to a sample IL program. This detailed example is
aimed at demonstrating the ease of constructing a transformation catalogue and the
feasibility of performing code generation using the IL/ML system.

The final chapter briefly discusses the metainterpreter and the facilities it
should provide, then summarizes the results of this work and suggests directions for
further research.
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Mins language described in ‘thie chapter serves as foundation 
for a ‘specifica tion Gonétructed as outlined in §1.8. tL supports 5 a skeletal ‘semantics 


common to ‘all programs from source to machine lenguege; “this chides primitives to 


MR MSS? ore 


describe the fiow of control and the | dof names and valuse within an IL 


program. in addition, IL noludes a mechaniem for samumapene: Information on 
particular operations and storage cells for later use by. “the ‘transformation 
catalogue and the metainterpreter. The remainder, of the ‘semantics ‘of an IL 
program (e.9., the. meaning of operations) reside in the’ Gensioraation catalogue afd 
are made available when these tranaformetions are fe applied by the the metainterpreter. 
By relegating the language and machine dependence to thé traneformetion 
catalogue ‘and providing a general syntactic machanien for accumulating information, 
IL becomes a suitable Intemediate language for the ‘entire translation process. in 
order to allow common cede generation operations (ow: anaiyels, omplte-time 
calculation of values) to be subsumed by the netainterpreter, separate fieids are 
provided in each iL statement for the information required by the metalnterprater In 
performing Ita analysis, 6 a 
Although IL in its most general form has a rather skeletal semantics and is a
suitable intermediate language for a wide variety of source languages, certain
conventions are established below for use in examples in later sections. Most of
these conventions were inspired by conventional sequential, algebraic languages
such as ALGOL, BLISS, or even CLU that are amenable to efficient interpretation
by conventional machine architectures (i.e., those traditionally thought of as
compiled languages). These conventions will be inappropriate in part for
compiled languages that are not related to ALGOL (e.g., LISP); in many cases
these can be easily accommodated by relatively simple changes. No direct
attention has been paid to the special problems associated with the translation
of languages whose control structure differs substantially from that of ALGOL
(e.g., SNOBOL, DYNAMO, SIMULA, etc.); this omission reflects the bias of this
research towards the specification of conventional code generators. Hopefully,
further work will fill this gap.
The most common form of intermediate representation is a flow graph of
basic blocks where each basic block is described by a directed acyclic graph or
dag (see, for example, Chapter 12 of [Aho77b]). IL is a linearization of this
representation [...] conventional languages. An IL statement may specify one of
two actions: the conditional transfer of control to another statement (these
correspond to the arcs of the flow graph); or the application of an operator to
its operands (these correspond to the interior nodes of a dag), optionally
saving the result in a named cell. Similarly, an IL statement may have one of
two effects: transfer of control or a change in the value of one or more cells.
As we will see below, it is easy to determine the exact effect of a statement
from its syntactic form; targets of transfers of control and the set of cells
changed by a statement (its kill set) are syntactically distinguished from other
portions of an IL statement.

As mentioned above, IL provides a schematic representation which is flexible
enough to be used for programs varying in level from source to machine language.
To encompass such a variety of programs, IL could not (and does not) have much
in the way of built-in semantics. The following list summarizes the primitive
concepts of IL:


    • conditional transfer of control to another statement in the
      program.

    • application of an operator to its operands. [...] operations
      supported by [...]; the designer must ensure that each operator
      can be interpreted by the target machine or further expanded in
      the transformation catalogue.

    • data storage provided by named cells. The scope of cell names and
      [...] cells. Cells follow an lvalue/rvalue semantics convention,
      similar to BCPL or BLISS. The name of a cell [...]

The following sections describe each of these areas in more detail, discussing
how popular concepts such as block structure, data types, etc. are handled by IL.


§2.2 Data in IL

All data storage in IL is provided by named cells: program variables,
intermediate results, etc. are represented in an IL program by a cell. Each cell
has three components:

    (1) an lvalue (name) which unambiguously identifies the cell. The scope
        [...]

        [...] structured for modeling arrays, structures, etc.


Note that no automatic translation is provided by the metainterpreter for cells;
the designer is responsible for treating each cell used in the IL program (by
incorporating appropriate transformations [...]). This may include allocating
main storage for program variables, [...] short-lived intermediate results), or
substituting [...] results computed at compile time [...] addressing).

Although an lvalue uniquely identifies a cell, it is not necessarily unique: a
given cell may come to have more than one lvalue through the ALIAS
pseudo-operation (see §2.3.3). From then on either name may be used
interchangeably. The ALIAS pseudo-operation may also be used to implement the
overlaying of storage, as is provided in many source languages by allowing the
equivalencing of names. Unlike FORTRAN, however, each alias must be made
explicitly; this is explored further in §2.2.2. Note that an [...] redundant
expression elimination or the [...] lvalue may be [...]; it will require
declaration of attributes similar to those for an rvalue (type, length, value,
etc.). Care must be taken so as not to confuse lvalue attributes with rvalue
attributes and vice versa.

There is no separate mechanism for the scoping of cells (block structure).
Through a declaration of a variable of the same name in an inner block, scoping
allows shielding of a cell from use inside that block. In practice, however,
procedure calls and pointers allow access to cells which are not directly
accessible as operands. Thus the original cell cannot be "forgotten" completely
while processing the inner block. A mechanism must be provided for referencing
both the new cell and the shielded cell when describing [...] within the inner
block [...] information [...] the metainterpreter. The additional cells provided
for by scope rules can be created by choosing different cell names for each new
[...] of the variable [...].


[...] values, [...] types; these may be [...] objects [...] range, [...]. In
practice, aliasing (see above) and lack of type checking [...] information. In
other words, just because a cell has been declared an integer pointer does not
guarantee that it points only to data of type integer. It is worth noting here
that the metainterpreter does know about certain types of objects, such as
numbers, allowing transformations to manipulate certain rvalues at compile time.

§2.2.1 Attributes

Attributes provide a general mechanism for associating information with
components (cells and statements) of an IL program. Attributes associated with
the lvalue or rvalue of a cell provide information which is unaffected by IL
operations, e.g., its type, storage class, size, etc. This information is
initially provided by the first phase of the compiler or added during
translation by transformations as it is "discovered." Once established, cell
attributes are available from any point in the IL program; thus information that
is context dependent (e.g., which register contains the current value of the
cell) cannot be stored as an attribute†. Attributes are the work horses of a
specification: they provide a symbol table facility for each declared variable
and intermediate result, model synthesized and inherited attributes used for
passing contextual information about the operation tree, and so on ad infinitum.

Attributes may also be associated with each statement. This includes properties
of the operator (e.g., commutativity, size of a target machine instruction),
effects on the global state of the interpreter (e.g., which condition codes are
changed by a target machine operator), progress made in translating the
statement (useful for communication between a set of transformations), etc. By
incorporating this information as attributes, transformations can tailor the IL
program taking into consideration machine- and language-dependent features
without building machine and language dependencies into the metainterpreter.


† [...] compile-time computation of rvalues will propagate this information as
effectively as if it were an attribute. Moreover, much of this type of
information is used for optimizations which are already incorporated in the
metainterpreter.

Attributes are referenced in an IL program as follows:

    "lvalue:attribute_name" for lvalue attributes;
    "<lvalue>:attribute_name" for rvalue attributes.

Each attribute has a value (always a literal) established in some IL statement
by including an assignment to the attribute name in the attribute field of that
statement. [...]


The first line indicates that the address (lvalue) of Z is a two byte unsigned
integer; this information will be needed for type checking performed by some
transformation if Z enters into a pointer calculation. [...] lexical level and
stack frame offset [...] (supplied by the first phase of the compiler or a
transformation applied earlier); a transformation could be included in the
transformation catalogue to compute the actual [...] of Z from this information.
Finally, the third line indicates that the rvalue of Z occupies 8 bytes and has
type real. Note that the "declaration" operator has no special significance in
IL; any semantics associated with this operator (e.g., allocation of storage or
the [...]) [...] catalogue. The same is true for each of the attributes
described in this [...]: [...] values are really literals; the [...] ascribed to
them in the explanation [...] reflects the role they play in transformations
applied by the metainterpreter.


§2.2.2 Structuring of cell names and values

The ability to structure lvalues (and their corresponding rvalues) simplifies
the modeling of aggregate data and operations which affect one or more
components. Each component is, in effect, a separate cell; its type, size, and
other attributes can be maintained separately from those of other components. It
is also possible to perform operations on the aggregate data as a whole,
changing all components in one operation. A component's lvalue is constructed by
appending the appropriate "selector" to the lvalue of the aggregate, i.e.,
aggregate_name.selector. For example, if A were an array dimensioned from 1 to
10 then

    lvalue      refers to

    <A>         the entire array
    <A>.2       the second component of A
    <A>.<I>     the Ith component of A
    <A>.*       all components of A (A.1 through A.10)

Note that <aggregate_name.selector> is equivalent to <aggregate_name>.selector;
either form may be used interchangeably. In the last line, "*" was introduced as
a convenient abbreviation for "all possible component names." Of course, "*" is
never actually expanded but rather serves as a wild card when resolving
attribute references to components of an aggregate cell. For example, <A>.*
would be used when referring collectively to elements of the array, as when
declaring the type of the elements (assuming A is homogeneous). Thus, if a
program contained the definition <A>.*:type=boolean then the attribute reference
<A>.3:type could be resolved to "boolean." <A>.* used in an attribute reference
is not equivalent to <A>: attributes for an aggregate are maintained separately
from those of its components. The following IL statement illustrates the
attributes which [...]

Note that the example specifies that the rvalue of A is an array 10 bytes long
and that the lvalue of A is a 2 byte unsigned integer (just like any other
address). The third line is included since A.lbound is likely to be used as an
operand in subscript calculations and therefore needs the appropriate
attributes. The last line [...] attributes to be included in this array
declaration, every effort has been made to ensure that each quantity which might
appear as an operand in subsequent operations has the required attributes. This
eliminates the need for any special casing: a multiply operation performed
during a subscript calculation receives the same treatment as any multiply
operation.

[...] the "*" notation is more powerful than the corresponding [...] reference
<A>.<I>:type (the type of [...]). The last line of the declaration indicates
that the type of any component is "boolean" and so <A>.<I>:type can be resolved
to "boolean" without further ado. If, on the other hand, separate type
declarations had been provided for each component (i.e., <A>.1:type=boolean,
etc.), resolution of <A>.<I>:type could not proceed without more knowledge of
<I> (the value of the subscript). Even though bounds checking may be desirable,
it is better accomplished explicitly at run time rather than implicitly during
compile-time type checking. Another solution would be to endow the
metainterpreter with special knowledge concerning attributes of array [...]
metainterpreter. All in all, the "*" notation comes much closer to the semantics
common to most aggregate data and leads to a simple mechanization of attribute
[...]
Operations which affect the rvalue of an aggregate (e.g., an array assignment to
<A>) are understood to change the rvalues of the components (e.g., <A>.1, <A>.2,
..., <A>.10). The converse is also true: a change in a component's rvalue
changes the rvalue of the aggregate. Both cases are based on the premise that
the rvalue of an aggregate is the "sum" of its components, i.e., that the rvalue
of an aggregate is not maintained separately from the rvalues of its components.
Thus <A> is equivalent to <A>.* (when speaking of rvalues; this differs from the
conclusion reached above for the managing of attributes). The effect of this
reasoning (see discussion in §2.3.1 on augmentation of kill sets) coincides with
common practice: a change in <A>.6 should invalidate any temporary copies of the
whole array (<A>) but should not affect temporary copies of other components
(e.g., <A>.7); on the other hand, changes in the whole array should invalidate
temporary copies of any component.
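
The invalidation rule just described can be sketched in a few lines of Python.
This is only an illustration: the dotted-string encoding of cell names and the
function name invalidated_by are assumptions made here, not part of IL or the
metainterpreter.

```python
def invalidated_by(changed, cached):
    """Return the cached cell names whose temporary copies become stale
    once the rvalue of `changed` is modified.

    Names are dotted strings such as "A" or "A.6" (a hypothetical encoding).
    """
    def contains(outer, inner):
        # True when `outer` is `inner` itself or an aggregate enclosing it
        return inner == outer or inner.startswith(outer + ".")

    # A cached copy is stale if it encloses the changed cell or lies inside it
    return {c for c in cached if contains(c, changed) or contains(changed, c)}
```

Under these assumptions, invalidated_by("A.6", {"A", "A.6", "A.7"}) yields
{"A", "A.6"}: the whole array is stale but the sibling component A.7 is not.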

As a final example of a structured cell, consider the following series of IL
statements (see §2.3.3 for a detailed description of the ALIAS
pseudo-operation):

    [...]

In this example, the rvalues of I and J overlay that of X (the designer has the
responsibility for making the storage allocated for I and J overlay the storage
for X in the final translation by adding appropriate transformations to the
catalogue). Note that although X is not explicitly declared to have any
components, aliasing I and J to X.1 and X.2 has caused them to become components
of X. Thus, using the reasoning of the preceding paragraph:

    (1) changes to the rvalue of X invalidate the rvalues of I and J;

    (2) changes to the rvalue of I invalidate the rvalue of X, but do not
        affect the rvalue of J;

    (3) changes to the rvalue of J invalidate the rvalue of X, but do not
        affect the rvalue of I.

The final two conditions show that I and J are understood to be disjoint. These
three conditions are just the semantics one associates with overlayed storage.


§2.3 The syntax of IL

An IL program is a sequence of statements made up of tokens classified as
literals, lvalues (the name of a cell), or rvalues (the application of the
contents operator to an lvalue). Depending on where a token appears in an IL
statement, it is further classified as a label, operator, operand, or attribute.
Label tokens must be lvalues; operator and attribute tokens are always literals;
operand tokens may be any flavor. Beyond the semantics associated with these
four classes of tokens, IL provides no further interpretation of ordinary
tokens. In this sense, IL is similar to BNF: neither provides any interpretation
of the symbols of the language. Special tokens are provided to indicate
transfers of control and their corresponding targets within an IL program. These
tokens are used in data flow analysis and are [...]

    operator       This field indicates the operation performed by this
                   statement.

    operand...     Zero or more operands used as arguments to the preceding
                   operation.

    attributes...  A list of zero or more "name=value" pairs further describing
                   the context and semantics of the statement.

[...] } else { X=3; Y=2 }; [...] the definition of C1 and C2 entirely from
Figure 2.1 and use the literals "2" and [...] the first phase of the compiler.
Attributes are described in more detail in §2.2.1.

In the description which follows, it will be useful to characterize tokens as
either literals or references (either an lvalue or rvalue). By way of example,
consider the following two lines from Figure 2[...]:
Label   Operator       Operands       Attributes

[...]   declaration                   X:type=integer X:size=2
                                      <X>:type=integer <X>:size=2
[...]   declaration                   Y:type=integer Y:size=2
                                      <Y>:type=integer <Y>:size=2
[...]   declaration                   Z:type=integer Z:size=2
                                      <Z>:type=integer <Z>:size=2
[...]   constant       "2"            <C1>:type=integer
[...]   constant       [...]          <C2>:type=integer
[...]   greater_than   <X> <Y>
[...]   if_goto        <T1> L2 L1
[...]   label          L1
[...]   store          <C2>
[...]   store          <C1>
[...]   goto           L3
[...]   label          L2
[...]   store          <C1>
[...]   store          <C2>
[...]   label          L3
[...]   add            [...]
[...]   store          [...]

Figure 2.1: initial IL representation


Label   Operator   Operands              Attributes
T100    equal      <X>:type "integer"
T1      add        <X> <Y>

The italicized tokens are literals; the rest, references. In IL, literals are
nothing more than character strings; interpretation of these strings is provided
by the transformation catalogue and the metainterpreter. References "refer" to
values established by other statements: they provide a level of indirection. The
principal difference between literals and references is that the meaning of a
literal can be established at compile time whereas references often refer to
values that are not known until execution time. Literals are of central
importance during optimization since their fixed semantics provide opportunities
for compile time evaluation of operations. Some references (e.g., <X>:type) may,
depending on the context in which they appear, refer to literals; in these cases
it is advantageous to remove the unnecessary level of indirection at compile
time. Valid references that cannot be resolved into literals at compile time
(e.g., [...]) will [...]

§2.3.1

[...] perform these optimizations. This suggests two formats for the label
field: "K" and "K,D" where K is the kill set of the statement and D the
corresponding defined set (D ⊆ K). When the abbreviated first format is used, D
is calculated as follows:

    case 1.  if K is empty (|K| = 0) then D = ∅
    case 2.  if K has a single element (|K| = 1) then D = K
    case 3.  if |K| > 1 then D = ∅
    case 4.  if K = "*" then [...]
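
The calculation of D from an abbreviated label can be sketched as follows. This
is an illustrative Python fragment, not the metainterpreter's code: the name
defined_set, the use of Python sets for K, and the constant ALL_CELLS are
assumptions made here. The treatment of the "*" label (D empty) is also an
assumption, chosen because no single cell is certain to have been changed.

```python
ALL_CELLS = "*"  # hypothetical encoding of the label meaning "any cell"


def defined_set(kill):
    """Derive the defined set D when only the kill set K is given.

    `kill` is either ALL_CELLS or a set of lvalue names.
    """
    if kill == ALL_CELLS:
        # Assumption: nothing is certainly defined when every cell may be killed
        return frozenset()
    kill = frozenset(kill)
    if len(kill) == 1:
        return kill        # case 2: the single cell is certainly changed
    return frozenset()     # case 1 (K empty) and case 3 (|K| > 1)
```

For instance, defined_set({"X"}) returns the set containing X, while
defined_set({"A.1", "A.2"}) returns the empty set.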


Considering only statements that affect at most one cell (all the statements in
Figure 2.1 fall into this category), there is a natural interpretation for each
of the above cases. Statements affecting no cells (e.g., transfers of control)
are covered by case 1. Statements whose operators have [...] semantics (add,
multiply, etc.) fall under case 2; the single element of the kill set is the
lvalue of the cell where the result is stored. The specified cell is always
changed by statements which always change the same cell (i.e., they do not
compute its lvalue); in these statements the label is essentially another
operand. Case 3 [...] that compute the lvalue of the cell in which the result is
to be placed, e.g., assignments through pointers or to array elements with
non-constant subscripts. Here, each cell in K has been killed (its previous
rvalue may have been changed, thus it can no longer be assumed that it is
available); however, no cell in K has been defined (no single cell is certain to
have been changed), hence D = ∅. In the final case, a label of "*" indicates
that all cells might be affected by executing the statement. For essentially the
same reasons given in §2.2, no provision has been made for specializing "*" by
specific cell attributes (e.g., type): in almost every language there exist
loopholes which make attribute information unreliable†. This label is used when
the statement has unfathomable side-effects, for example, when the label field
contains too complex an expression (e.g., deeply nested contents operators);
when an lvalue subexpression has become unwieldy it is always legal to assume
its value is "*" and proceed from there. This overly conservative interpretation
may result in missed optimization opportunities but never in an incorrect
translation.

† [...] this is the semantics provided by many languages and relied upon by
programmers to circumvent certain language restrictions. [...]

Procedure calls have the potential of affecting many cells and so do not fall
into the categories discussed above. The sequence of statements which form the
body of the procedure may kill and define cells; taken in the aggregate it is
possible that K ⊇ D ≠ ∅. In addition, procedures that return a value add yet
another element to D (the cell containing the returned value). The second label
format, "K,D", is used for procedure calls. While it is theoretically possible
to compute the appropriate label by examining the body of the procedure, this
calculation quickly becomes unwieldy. A reasonable alternative is to assign
procedure calls the label "*,R" where R is the lvalue of the cell in which the
returned value (if any) is stored. Thus the semantics of a procedure call is
reduced to invalidating previously calculated values for all cells except the
one containing the return value.

As was outlined in §2.2.2, it is occasionally necessary to augment the kill set
of a statement to account for the semantics of aggregate cells. Although the
size of the kill set may be increased, the defined set calculated above remains
unchanged; essentially no new cells are being added to the kill set, but only
other lvalues for the affected rvalue(s). The objective of augmenting the kill
set is to explicitly include the lvalue of every cell which is affected by the
statement; this reduces the amount of computation performed by the
metainterpreter when using the kill set.

[...] optimizations, as this would lead to incorrectly transformed programs;
only the programmer is allowed to play havoc with his program!

The following algorithm constructs an augmented kill set K' from the original
kill set K. K' will include all lvalues ALIASed to lvalues in K as well as the
lvalues of aggregates which subsume lvalues in K. In constructing K', a
distinction is made between an aggregate [...]; it refers to the aggregate
treated as a single value (i.e., only temporary copies of the entire aggregate
or its [...]) [...] components of an aggregate's [...] "A.*" would invalidate
any components [...] subcomponents [...].

    1. Initially K' = K.

    2. For each structured lvalue in K [...] an lvalue is [...]

    3. For each [...]

    4. [...] an element of K, this step would add A.1.2, A.1, and A to K'.

    5. Repeat steps 3 and 4 until no more additions are made to K'.

The final result for K' is the augmented kill set for the statement. The
following [...] exhibition, duplicate lvalues (e.g., X.* and X.1) have been
removed from the kill sets. The examples assume [...] examples in §2.2.2.

    kill set (K)        augmented kill set (K')

    [...]               [...]
    {A.3 A.4}           {A.3 A.4 A}
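
The idea of the augmentation can be sketched as a fixed-point computation in
Python. This is a simplified illustration under stated assumptions, not the
algorithm as specified: lvalues are encoded as dotted strings, ALIAS
relationships are supplied as pairs of names, the function name
augment_kill_set is invented here, and the "*" wild card is not modelled.

```python
def augment_kill_set(kill, aliases):
    """Return K': K plus every aliased lvalue and every enclosing aggregate.

    kill    -- iterable of dotted lvalue names, e.g. {"A.3", "A.4"}
    aliases -- iterable of (name, name) pairs recorded by ALIAS statements
    """
    # Symmetric alias map; transitivity falls out of the worklist loop below
    alias_map = {}
    for a, b in aliases:
        alias_map.setdefault(a, set()).add(b)
        alias_map.setdefault(b, set()).add(a)

    augmented = set(kill)
    work = list(augmented)
    while work:                      # repeat until no more additions are made
        name = work.pop()
        new = set(alias_map.get(name, ()))
        if "." in name:
            new.add(name.rsplit(".", 1)[0])   # aggregate subsuming this lvalue
        for n in new - augmented:
            augmented.add(n)
            work.append(n)
    return augmented
```

With no aliases, augment_kill_set({"A.3", "A.4"}, []) gives {"A.3", "A.4", "A"},
which agrees with the last row of the table above.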
§2.3.2 The operator and operand fields

No particular semantics is attached to the operator field of a statement. The
meaning of an operator is established by transformations which expand it into
other IL or target machine operations. A useful analogy for an IL operator is a
macro: the body of the macro defines the effect of an operator in terms of
other, usually simpler, operations. If the effect of the macro can be
accomplished directly by the target machine no further refinement of the
operation is necessary; [...] case a sequence of IL operations) would be [...]
making the appropriate substitutions of actual operands for formal parameters of
the macro. If each expansion is subject to later optimization, it is possible to
use general definitions for each macro operation, i.e., definitions such as one
would find in an interpreter. Special cases that hinge on particular values of
the operands would be explicitly tested for in the substituted sequence; later
optimization would eliminate those operations which could be performed at
compile time. For example (see Figure 2.2), the expansion of the addition
operator might test the type of its operands and then perform an integer or
floating point addition as appropriate. If the type of the operands could be
established at compile time, this test would be subsumed during optimization.
Although it is not necessary, use of general definitions greatly simplifies the
top level of a specification as there will be only one transformation for an
operation rather than one for each special case.
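
The macro analogy can be illustrated with a small Python sketch. Everything
here is hypothetical: the tuple encoding, the operator names if_type, int_add
and float_add, and the functions expand_add and fold are invented for
illustration and are not IL syntax or the contents of Figure 2.2.

```python
def expand_add(x, y):
    """General definition of `add`: a type test guarding the two special cases."""
    return ("if_type", x, "integer", ("int_add", x, y), ("float_add", x, y))


def fold(stmt, known_types):
    """Remove the type test when the operand's type is known at compile time.

    known_types maps operand names to type literals (standing in for the
    type attributes of the cells).
    """
    if stmt[0] == "if_type":
        _, operand, wanted, then_op, else_op = stmt
        known = known_types.get(operand)
        if known is not None:
            return then_op if known == wanted else else_op
    return stmt   # type unknown: the test must remain in the program
```

Only one general expansion is written, yet fold(expand_add("X", "Y"),
{"X": "integer"}) reduces to the integer special case, mirroring how later
optimization subsumes the test.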
Label   Operator   Operands

[...]   [...]      <Y>:type "real"

Operands serve as arguments to the preceding operator and may be any of the
following:

    a literal. Literals are enclosed in quotes when they appear in the
    [...]

    a reference expression: [...] of the cell [...]

There is no a priori restriction on the complexity of a reference expression,
but more than one level of indirection (contents operator) will likely have to
be calculated in a separate statement. By convention, at most a single level of
[...]

§2.3.3 The END and ALIAS pseudo-operations

Pseudo-operations provide a mechanism for informing the metainterpreter about
information difficult (or impossible) to derive from the IL program. IL
statements with pseudo-operators are "visible" to the transformations, which may
transform them into ordinary IL statements, etc., but they become invisible in
the final translation (i.e., they are not output in the resulting target [...]).
The names chosen for pseudo-operations are reserved and should not be used for
other purposes by the designer; in this thesis, pseudo-operators will be
displayed in upper case and all other operators displayed in lower case.

The statement in which the END pseudo-operation appears marks the logical end of
an IL statement sequence; flow analysis for that sequence will not proceed past
this statement. Statements following this statement up to the next target
statement (see §2.4) are considered inaccessible and will be removed by the
metainterpreter. The END pseudo-operation is intended for use at the end of the
IL program and for marking the end of procedure bodies within the IL program;
presumably some transformation will translate it into an exit or return as
appropriate. This operation makes no use of the label, operand, or attribute
fields and so may be used as the operator of a target statement.

The ALIAS pseudo-operation provides the capability of defining equivalence
classes of lvalues: any member of an equivalence class refers to the same rvalue
(although each member may have different attributes associated with it). This
operation is used to indicate sharing of rvalues (overlaying of storage) as
declared by the source language program (e.g., with the FORTRAN EQUIVALENCE
statement) or as determined in some transformation (e.g., when used to indicate
that two cells hold the same value; this typically occurs during optimization
when a sequence of statements boils down to a move from one cell to a temporary,
and the ALIAS operation would indicate that the temporary is aliased with the
original cell). In the latter case, the ALIAS operation provides a renaming
capability to the transformation designer. The form of the ALIAS statement is

    [...]

which causes the metainterpreter to place lvalue₁ in the same name equivalence
class as lvalue₂. Note that, by definition, ALIAS is a transitive operation.
Typically lvalue₁ is the new lvalue to be defined and attributes... are its
initial attributes.
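
One way to picture the name equivalence classes is a union-find structure,
sketched below in Python. This is an assumption about a possible
representation, not a description of the metainterpreter; the class name
AliasClasses and its methods are invented for illustration.

```python
class AliasClasses:
    """Equivalence classes of lvalue names built up from ALIAS statements."""

    def __init__(self):
        self.parent = {}

    def find(self, name):
        """Return the representative name of the class containing `name`."""
        self.parent.setdefault(name, name)
        while self.parent[name] != name:
            # Path halving keeps later lookups short
            self.parent[name] = self.parent[self.parent[name]]
            name = self.parent[name]
        return name

    def alias(self, new_name, existing):
        """Record that `new_name` is ALIASed to `existing`."""
        self.parent[self.find(new_name)] = self.find(existing)

    def same_cell(self, a, b):
        """True when both names refer to the same rvalue."""
        return self.find(a) == self.find(b)
```

Transitivity comes for free: after aliasing I to X.1 and then T to I, the names
T and X.1 fall in the same class, while an unrelated name such as J does not.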


§2.4 Flow of control in an IL program

In the previous sections, the syntax and semantics of a single IL statement were
described; this section describes the semantics of a sequence of statements. IL
statements are executed sequentially; [...] explicit transfers of [...] to IL
are similar to those provided at the machine level. Sequential execution is also
compatible with a wide variety of languages, especially those that have
relatively severe ordering constraints (e.g., ALGOL, which specifies strict
left-to-right evaluation of expressions). This control structure is more
constraining than [...] by dags is that the sons (operands) of an interior node
(operation) must be evaluated before the node can be evaluated. Some languages
(e.g., BLISS) take advantage of this flexibility in expression evaluation by
only imposing evaluation order constraints on certain operators (such as
BEGIN...END). Such flexibility is not inherent in an IL program and must be
provided by the transformation catalogue and the metainterpreter:
transformations can change the order of statements in an IL program†.
As was mentioned at the beginning of this chapter, the syntactic conventions
discussed below are not particularly appropriate for languages whose control
structure differs substantially from that above. SNOBOL, for example, requires a
"transfer of control" with every statement; the difficulty in accommodating this
construct in IL reflects the difficulties in producing a SNOBOL compiler for
conventional machines. Perhaps when the latter problem has been solved, the
solution can be incorporated in IL.


† In general these transformations only change the evaluation order to achieve
some goal, for example, a reduction in the number of registers required to
evaluate the operator. In this way the conditions under which evaluation order
can be modified and what metrics are used to judge the result are made explicit
in the transformation catalogue. The information would be useful during the
analysis phase of a metacompiler attempting to construct a code generator from
the specification.

IL statements which cause a transfer of control (transfer statements) are
readily identified: they have a "[...]" in their label field. Note that this use
of the label field prevents the statement from computing (and saving) a value;
it can only effect a transfer of control. Procedure calls are handled
differently: since control returns to the statement following the procedure
call, they are similar to ordinary statements except for the possible
side-effects of the procedure body. In §2.3.1, a convention for the label field
for procedure calls was established (listing the side effects of the procedure);
thus, no transfer is explicitly indicated. Procedures are treated as "complex"
operations in so far as this section is concerned. Note that a transfer
statement always transfers control; if execution can conditionally continue with
the next statement, it must be provided for explicitly by adding an additional
label statement.

IL statements which are targets for a transfer of control (target
statements) are identified by placing "[...]" in their label field. As for transfer
statements, target statements cannot compute (and save) a value since their label
field has been preempted. The following convention is used by the metainterpreter
for determining which target statements are possible targets for a given transfer
statement:

        a target statement is a target for a given transfer statement iff the
        same value appears as some operand of both the target statement
        and the transfer statement.
This convention allows additional arguments to transfer and target statements
which can be used by the operator of these statements. The following example
(extracted from Figure 2.20) illustrates the convention more clearly:

        [listing not recoverable: one transfer statement followed by two target statements]

The first line is a transfer statement (note the label field) which can transfer
control to either of the target statements. The second statement is recognized as
a possible target since the value L1 is an operand of both the first and second
statement; similar reasoning holds for the last target statement. It is not possible
to tell from the above program the circumstances under which either label is [...]
the value of T100), information that only exists in the transformation catalogue.

The information is used by the metainterpreter to construct a "maximal" flow
graph for the IL program. The flow graph is maximal in the sense that all possible
targets are considered for each transfer statement, even those which may be
ruled out by the semantics of the operator of the transfer statement. This graph
serves as the basis for the flow analysis performed by the metainterpreter and is
updated whenever a transformation changes or eliminates a transfer statement.
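
The target convention and the maximal flow graph can be sketched in a few lines of
Python. This is only an illustration: the Stmt record, the kind tags ("transfer",
"target", "plain") and the operator names are assumptions made for the sketch and are
not part of IL.

```python
from dataclasses import dataclass

@dataclass
class Stmt:
    kind: str        # "transfer", "target", or "plain" (hypothetical tags)
    operator: str
    operands: tuple

def maximal_flow_graph(program):
    """Return {index: set of successor indices} for a list of Stmt."""
    edges = {i: set() for i in range(len(program))}
    for i, stmt in enumerate(program):
        if stmt.kind == "transfer":
            # A transfer always transfers control: link it to every target
            # statement that shares at least one operand value with it.
            for j, other in enumerate(program):
                if other.kind == "target" and set(stmt.operands) & set(other.operands):
                    edges[i].add(j)
        elif i + 1 < len(program):
            # Ordinary and target statements fall through to the next statement.
            edges[i].add(i + 1)
    return edges

program = [
    Stmt("transfer", "branch", ("T100", "L1", "L2")),
    Stmt("target", "label", ("L1",)),
    Stmt("plain", "add", ("X", "Y")),
    Stmt("target", "label", ("L2",)),
]
graph = maximal_flow_graph(program)
```

The graph is "maximal" because the operator of the transfer is never consulted: both
targets are recorded as successors of statement 0 even if its semantics would rule one out.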


§2.6 Compile-time calculation of rvalues

One of the goals for the syntax of IL is to allow the compile-time calculation
of rvalues. This section briefly touches on the resolution of rvalues (and lvalues)
using the notation developed in earlier sections. In this section, set notation is
used to indicate possible values for a reference expression, e.g., if the rvalue of I
is known to be either 3 or 4 then we write <I> = {3 4}. If the value of a
reference expression is unknown (i.e., it could be any possible value) then we
write {*}.

Occasionally, it is possible to further resolve a particular reference
expression. If <I> = {3 4} then

        <A>.<I> ≡ <A>.{3 4} ≡ {<A>.3 <A>.4}.

If, on the other hand, the value of I is unknown (<I> = {*}) then

        <A>.<I> ≡ <A>.{*} ≡ <A>.*

IL recognizes the alternative forms in each example as equivalent: in effect, such
resolution is performed automatically. Even in the absence of knowledge about the
rvalue of I, a reasonable interpretation of lvalues incorporating <I> is possible,
erring only in that it is likely to be an overly conservative interpretation. In the
second example above, the distinction between "*" as an abbreviation for all
possible component names and {*} as the representation for all possible values has
been deliberately blurred. The intent behind assigning numeric selectors for the
components of the array A is to allow this sort of felicitous confusion.
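
A minimal Python sketch of this resolution follows. The string representation of
reference expressions and the name resolve are assumptions of the sketch; only the two
rewriting cases come from the text.

```python
ANY = "*"   # stands both for "all component names" and for the unknown value set {*}

def resolve(base, selector_values):
    """Expand base.<I> given the value set of <I>."""
    if selector_values == ANY:
        # Unknown selector: fall back to the conservative form base.*
        return {f"{base}.{ANY}"}
    return {f"{base}.{v}" for v in selector_values}

known = resolve("<A>", {3, 4})     # <I> = {3 4}
unknown = resolve("<A>", ANY)      # <I> = {*}
```

Using one token for both meanings of "*" mirrors the deliberate blurring described above.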

As a rule of thumb, the utility of the compile-time computation of a cell's
rvalue is inversely proportional to the size of the value set. There are several
contributing factors: as the size of the value set increases, it becomes
increasingly unlikely that any significant optimizations will be possible for rvalue
operations on that cell. In addition, uncertainty in one cell's rvalue tends to
propagate to other cells whenever the first cell is used as an operand (the value
set of an operation is proportional to the product of the value sets of the
operands). Such "dilution" of compile-time information is not unexpected: it would
be unreasonable to expect to perform all computations at compile time! However,
the prognosis at this point is not encouraging: it would appear that large amounts
of compile-time information could be collected with little prospect of a
corresponding gain in the optimality of the resulting translation.
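
The growth of value sets can be seen in a small Python sketch. The helper name
value_set and the sample sets are illustrative assumptions; the product is an upper
bound, since different operand combinations may yield the same result.

```python
from itertools import product

def value_set(op, *operand_sets):
    """Value set of an operation: apply op to every combination of operand values."""
    return {op(*combo) for combo in product(*operand_sets)}

i = {3, 4}              # two possible rvalues
j = {10, 20, 30}        # three possible rvalues
sums = value_set(lambda a, b: a + b, i, j)
```

Here two modestly uncertain operands already give six possible results for a single addition.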
[...] of rvalues is subject to the law of diminishing returns, and therefore rvalues are
not suitable for cell attributes that do not change with each operation on the cell.
The first observation serves as further motivation for the introduction of {*} for
[...] attributes [...] a useful addition to the semantics of a cell because they provide a [...]

§3.1 The transformation catalogue

A major design goal for the IL/ML system was to keep knowledge about the
source language and target machine separate from general knowledge about code
generation. This was accomplished by providing for a separate description of
machine- and language-dependent semantics; the embodiment of this description is
the transformation catalogue. Each piece of language- or machine-specific
information is expressed as a syntactic transformation of an IL program fragment;
after the transformation has been applied, the updated program will have been
modified to incorporate this new information in terms the metainterpreter
understands: as attributes or a new sequence of IL statements. The
metainterpreter provides the remainder of the framework needed to finish the task
of code generation: whenever it exhausts its analysis of the current program it
returns to the transformation catalogue to gather additional information (in the form
of a "new" IL program to analyze). The cycle of analysis and transformation
repeats until the translation is complete.
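
The cycle of analysis and transformation might be sketched as the following Python
loop. Everything here is a simplifying assumption: a program is a list of strings, a
transformation is a function returning a new program or None, and the analysis is a
stand-in callback. The text leaves the choice among applicable transformations to the
metainterpreter; this sketch simply takes the first one.

```python
def translate(program, catalogue, analyze):
    """Alternate analysis and transformation until no transformation applies."""
    while True:
        analyze(program)                       # stand-in for the metainterpreter's analysis
        for transformation in catalogue:
            updated = transformation(program)  # new program, or None if not applicable
            if updated is not None:
                program = updated
                break
        else:
            return program                     # nothing applied: translation is complete

# Toy catalogue with one hypothetical transformation.
def expand_negate(prog):
    if "negate x" in prog:
        i = prog.index("negate x")
        return prog[:i] + ["load x", "subtract-from-zero"] + prog[i + 1:]
    return None

result = translate(["negate x", "store y"], [expand_negate], analyze=lambda p: None)
```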

This chapter discusses the transformation catalogue and the language which
serves as its basis: a metalanguage (ML) for describing IL program fragments.
Using ML, the designer can write templates which describe the class of IL
statements in which he is interested. This class can be quite large (e.g., "all IL
statements which have commutative operators") or quite small (e.g., "only
statements which apply the sine operator to the literal 3.14159") depending on
the application the designer has in mind. IL fragments belonging to the class
described by a template are said to match the template. §3.2 presents a detailed
description of the syntax of ML.

Two templates are incorporated in each transformation: one as a pattern, the
other as a replacement. The pattern specifies the context of the transformation as
a set of program fragments on which the transformation can operate†. IL
statement(s) which match the pattern become candidates for the modifications
specified by the replacement. The replacement, perhaps using statements or
components matched by the pattern, describes a new IL program fragment to be
substituted for the matched fragment.


The use of transformations is a well-established technique for embodying
knowledge for later use in a mechanized fashion (see §1.3.1). If all the contextual
information in the data base (in this case, the IL program) is available in syntactic
form, patterns provide a concise description of where the piece of information
captured by the transformation is applicable. Using the transformation catalogue is
reduced to finding a transformation which matches the given IL statement (or any IL
statement, if the metainterpreter has no particular goal in mind); alternatively, the
replacement (which is also a pattern) can be examined to determine if it
accomplishes the desired effect. The ability to use transformations from either end
enhances their utility as the basis for knowledge representation.

§3.3 describes how transformations are constructed and how they are used
by the metainterpreter. The final section of this chapter presents a series of
annotated example transformations.


† This context can be further modified by a set of conditions specifying
constraints which are not expressible in terms of the syntax of the IL program
(see §3.3.1).

§3.2 ML: a language for describing IL program fragments

ML is similar to other metalanguages: its syntax subsumes that of IL (i.e.,
an IL statement is a legal ML statement) and, in addition, it allows certain
metasymbols to replace IL components or statements. The metasymbols come in
two flavors: wild cards that act as "don't cares" in the matching process, and calls
to built-in functions that allow access to some of the metainterpreter's knowledge
of IL program semantics. Use of these metasymbols permits the designer to write
generalized IL program fragments; these fragments are more general than an IL
statement because the designer need specify only those components in which he is
interested (using wild cards to specify the remaining components).

However, the designer can only generalize along certain dimensions as his
only access to the meaning of an IL statement is its syntactic form and whatever
built-in functions are available (see §3.2.2). Since the separate fields for kill sets
and attributes in an IL statement seem to be as far as one can go towards making
the syntactic form of an IL statement reflect the statement's semantics without
limiting the generality of IL, the limiting factors are the capabilities of the built-in
functions. The designer can determine whether two literals are the same but may
not be able to find out, for example, whether the square root of a literal is an
integer. These restrictions on the abilities of built-in functions are the most severe
limitation of ML: building language- and machine-specific predicates into ML is
ruled out as this affects the generality of the system and, unfortunately, it would
be impossible to include all the generally useful functions. Lest we be accused of
making a mountain out of a molehill, it should be pointed out that the result of these
limitations is missed optimization opportunities. Presumably all the computations
specified in the IL program could be done at execution time; the computational
facilities provided by ML are intended to allow special tailoring of the
transformations and not to be an essential component of the transformations. ML
takes the middle road by providing built-in functions for manipulation of literals and
for interpreting literals as numeric quantities; other functions must be constructed
from these by including the appropriate transformations in the catalogue. These
additions to the catalogue are sufficient for most purposes. For example, the
catalogue may contain transformations for simplifying the application of the
transcendental functions to certain arguments (0, π/2, etc.) but would translate all
other applications to a run-time call of the appropriate function.

§3.2.1 describes wild cards; §3.2.2 enumerates some example built-in
functions. Example ML statements can be found in the last section of the chapter
as patterns and replacements in transformations.


§3.2.1 Wild cards

Wild card metasymbols are used as components of an ML statement
wherever a specific IL component would be too restrictive: the wild card will
match any IL component(s). The discussion below describes the meaning of ML
statements when used in a pattern; to a large degree the semantics of
replacements are similar (differences are described in §3.3.2). There are four forms
of wild card:

        ?name     a single IL component
        ?*name    a sequence of zero or more IL components
        $name     a single IL statement
        $*name    a sequence of zero or more IL statements

The names distinguish the different wild cards used in a single pattern or
replacement. These names are also used in the replacement to refer to components
or statements matched in the pattern. If a given wild card appears more than once
in a pattern or replacement (i.e., two or more wild cards with the same form and
name) they are understood to represent the same IL component; if this duplication
occurs within a pattern then all the copies must match IL components with the same
representation.

The ? and $ wild cards match a single, non-null component or statement
respectively, i.e., for each ? ($) there must be a corresponding IL component
(statement) in the IL program fragment which is being matched. Note that when
[...] wild cards [...] or the match will fail. Thus [...] label field [...] components of
the operator (use a ? wild card) and operand (use a ?* wild card) fields.

The ?* wild card matches any sequence of zero or more IL components
within a single field; what components are matched will depend on the
components on either side of the ?* wild card in the ML statement. If these
adjacent components constrain the match for the ?* wild card to a single
sequence, the ?* wild card is said to be unambiguous. In general, if more than one
?* wild card is used in a single field, the match is ambiguous; this is always the
case if two ?* wild cards are adjacent or separated by any number of ? wild
cards. Even if specific IL components are interposed, repetition of this component
in the IL field can cause the ?* wild cards to be ambiguous. For example, consider
the sequence of components "A B C C D". There are two ways in which
components can be assigned to the ML expression "?*x ?y C ?*z":

        ?*x="A"    ?y="B"  ?*z="C D"    or
        ?*x="A B"  ?y="C"  ?*z="D".

Ambiguous wild cards are useful for matching a specific component anywhere in a
field; e.g., the following ML statement matches any add statement which has at [...]

If "add" is a binary operator, one of the two ?* wild cards will be assigned no
components [...]. [...] in the replacement.
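
The ambiguity of ?* wild cards can be demonstrated by enumerating every possible
assignment. The sketch below is an assumption-laden simplification: a field is a Python
list of component names, wild cards are strings beginning with "?" or "?*", and the
requirement that duplicate wild card names match identical components is not modelled.

```python
def match_field(pattern, components, bindings=None):
    """Yield every assignment of components to the wild cards in pattern."""
    bindings = bindings or {}
    if not pattern:
        if not components:          # both exhausted: a complete match
            yield dict(bindings)
        return
    head, rest = pattern[0], pattern[1:]
    if head.startswith("?*"):
        # Try every possible length for the sequence, shortest first.
        for n in range(len(components) + 1):
            yield from match_field(rest, components[n:], {**bindings, head: components[:n]})
    elif head.startswith("?"):
        if components:              # ? needs exactly one component
            yield from match_field(rest, components[1:], {**bindings, head: components[0]})
    elif components and components[0] == head:
        yield from match_field(rest, components[1:], bindings)

matches = list(match_field(["?*x", "?y", "C", "?*z"], ["A", "B", "C", "C", "D"]))
```

For the example in the text the enumeration produces exactly the two assignments listed above.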
The $* wild card matches a sequence of zero or more IL statements. Unlike
?* however, the sequence is not determined by lexical juxtaposition in the IL
program but by flow of control: statements are considered adjacent in the process
of matching if one might follow the other in execution. Branches and joins in the
flow of control often result in more than one possible sequence of statements that
could match a $* wild card. For example, consider the IL program given in Figure
2.1 and the following sequence of ML statements:

        [ML statements not recoverable]

Figure 3.1 shows the two possible sequences of IL statements that could be
matched by $*A. In such cases, both sequences are saved as possible values for
$*A. The most common use of $* wild cards (and the sets of statement sequences
that they match) is to establish the context of a transformation: there exist
built-in functions that test these sequences for simple properties (e.g., presence of
a given value in the label field of at least one statement in one of the sequences).
Figure 3.1: Matches for $*A from Figure 2.1
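
How a $* wild card collects more than one candidate sequence can be sketched by
walking a flow graph. The graph below is hypothetical (it is not Figure 2.1), and the
function name and the rule of never revisiting a statement on one path are assumptions
of the sketch.

```python
def sequences_between(flow, start, end):
    """Enumerate statement sequences that may execute strictly between start and end."""
    results = []

    def walk(node, path):
        for nxt in sorted(flow.get(node, ())):
            if nxt == end:
                results.append(list(path))
            elif nxt not in path:        # do not go round a loop twice
                walk(nxt, path + [nxt])

    walk(start, [])
    return results

# Hypothetical flow graph: statement 0 branches to 1 or 2, both rejoin at 3.
flow = {0: {1, 2}, 1: {3}, 2: {3}, 3: set()}
candidates = sequences_between(flow, 0, 3)
```

Both sequences are kept, matching the rule that every possible sequence is saved as a
value of the wild card.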


§3.2.2 Built-in functions

Built-in functions are used in ML statements to perform operations that
require more power than simply rearranging an IL statement. A call on a built-in
function has the following form:

        function[argument, argument, ...]

The use of square brackets distinguishes built-in function calls from ordinary IL
components (which are restricted to the use of parentheses). All functions return
a result (no side effects are possible); this result can be used as the argument to
another built-in function or, if the call was part of a replacement, become part of an
IL program. The arguments to a function may be written as either IL or ML
components but they must be able to be resolved by the metainterpreter to a
particular IL component (or IL statement sequence for certain functions). In the
process of applying the function to its arguments, the function may abort, causing
the application of the transformation to fail regardless of the location of the
function call (pattern, replacement, or conditions). The main reason for aborting a
function is an inappropriate argument, e.g., the argument has the wrong type,
cannot be resolved to a literal, etc. For instance, the add function aborts if both
operands are not numeric literals [...].


By way of example, several functions are described below; this list is not
meant to be complete, as only a sampling of each category of function has been
described. It is expected that an implementation would expand the list; the only
criterion for including a function is that it not cater to a specific language or
machine. The following argument types are used in describing functions:

        component    Any IL component is an acceptable argument.

        literal      The argument must be an IL literal (i.e., an
                     operator, attribute reference, or operand
                     enclosed in quotes).

        number       The argument must be an IL literal which can be
                     interpreted as a number (i.e., it contains only
                     digits, a decimal point, and a sign).

        boolean      The argument must be one of the IL literals
                     "true" or "false".

        sequence     The argument must be the result of a $* wild
                     card match (i.e., a set of IL statement
                     sequences).

If the supplied argument does not have the correct type, the metainterpreter will
abort the application of the function and hence the application of the
transformation in which it appears.


and[boolean,boolean]
or[boolean,boolean]
not[boolean]
        the standard boolean functions evaluating to the literals "true" or
        "false" as appropriate. These are used most often in conjunction
        with other functions to form more complex predicates.

equal[literal,literal]
        compares two literals to see if they have the same representation;
        evaluates to "true" if they do, "false" otherwise. Note that equal
        cannot be used to compare two arbitrary IL components; this can
        usually be accomplished directly in the pattern by using the same
        wild card name in both component locations.

constant[component]
        evaluates to "true" if the argument is a literal, "false" otherwise.

lvalue[component]
        evaluates to "true" if the argument represents a valid lvalue.

label[label,sequence]
        evaluates to "true" if any member of the augmented kill set
        represented by label appears in the label field of a statement
        contained in the set of IL statement sequences sequence. This
        function determines whether a cell(s) has been modified in an IL
        statement sequence. The label function is representative of
        functions that search IL statement sequences for simple properties;
        other functions that test for [...] properties [...] and search
        other statement fields should be included.

add[number,number]
subtract[number,number]
multiply[number,number]
divide[number,number]
        the standard arithmetic functions returning the appropriate numeric
        literal. In order to avoid representation problems, a [...] limit
        may be set by the implementation.

power_of_two[number]
        evaluates to "true" if the argument is a numeric literal which is a
        power of two, "false" otherwise. This function is useful for
        determining when to change multiplications and divisions into shifts.
        This example represents the tip of the iceberg when it comes to
        useful arithmetic functions; a reasonable subset might be to include
        only operations on binary representations (binary log, logical and
        arithmetic shifts, etc.).


Choices of the domain (arguments for which the function will not abort) for the
predicates described above have been made arbitrarily. All that really matters is
that the choices are consistent with the use of the functions in the transformation
catalogue.
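
A few of these functions, together with the abort behaviour, might be modelled in
Python as below. This is a sketch under stated assumptions: literals are Python strings,
numbers are restricted to integers (the text also allows a decimal point), and the
exception name Abort is invented for the illustration.

```python
class Abort(Exception):
    """Raised when a built-in function cannot be applied to its arguments."""

def as_number(literal):
    # Resolve a literal to an integer, aborting if that is impossible.
    try:
        return int(literal)
    except (TypeError, ValueError):
        raise Abort(f"not a numeric literal: {literal!r}")

def add(a, b):
    return str(as_number(a) + as_number(b))      # result is again a literal

def power_of_two(a):
    n = as_number(a)
    return "true" if n > 0 and (n & (n - 1)) == 0 else "false"

def equal(a, b):
    return "true" if a == b else "false"
```

A caller that catches Abort and abandons the transformation reproduces the rule that an
aborting function causes the whole application to fail.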
§3.3 Transformations and pattern matching

A transformation is made up of three components: a pattern, a replacement,
and a set of conditions. The pattern (an ML program fragment) and the conditions
(a set of predicates) establish the context of the transformation, that is,
those IL program fragments on which the transformation can operate. A contiguous
group of statements within the pattern is designated as the target; these
statements must be contiguous as they will be replaced in their entirety by the
new IL program fragment constructed from the replacement once the context has
been verified†.

The following criteria must be met before a transformation can be applied:

    (1) all components of the pattern must match some component in the IL
        program fragment (and vice versa); duplicate wild cards must have
        matched IL components with the same representation.

    (2) each of the conditions must evaluate to "true". If any condition aborts
        (see §3.2.2), the application of the transformation fails. Note that
        wild cards may appear in a function's arguments; these wild cards will
        be replaced by the IL components they matched during (1) before
        evaluation of the function.

    (3) the target must be a contiguous group of statements from the
        matched IL program fragment.

    (4) the replacement must be successfully constructed: each
        built-in function call it contains must be evaluated without aborting.

If all these criteria are met, the newly constructed replacement is substituted for
the target, completing the application of the transformation.
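
The criteria can be read as a short procedure. The Python sketch below is illustrative
only: it handles a single-statement target (so criterion (3) is trivially satisfied),
represents a statement as a tuple, and uses invented helper names. The multiply-to-shift
rule is a made-up example in the spirit of power_of_two.

```python
class Abort(Exception):
    pass

def apply_transformation(fragment, match, conditions, build):
    """Return the replacement fragment, or None if any criterion fails."""
    bindings = match(fragment)                    # criterion (1): pattern matches
    if bindings is None:
        return None
    try:
        if not all(cond(bindings) == "true" for cond in conditions):   # criterion (2)
            return None
        return build(bindings)                    # criterion (4): build the replacement
    except Abort:
        return None                               # an aborting built-in fails the application

def match_multiply(stmt):
    op, dest, left, right = stmt
    return {"dest": dest, "left": left, "right": right} if op == "multiply" else None

def is_power_of_two(b):
    if not b["right"].isdigit():
        raise Abort("operand is not a numeric literal")
    n = int(b["right"])
    return "true" if n > 0 and (n & (n - 1)) == 0 else "false"

def build_shift(b):
    # Shift count is log2 of the (power of two) multiplier.
    return ("shift_left", b["dest"], b["left"], str(int(b["right"]).bit_length() - 1))

shifted = apply_transformation(("multiply", "t1", "x", "8"),
                               match_multiply, [is_power_of_two], build_shift)
```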
The following section describes the syntax of a transformation in more
detail; §3.3.2 outlines how the replacement is constructed.


† Statement sequences matched by $* wild cards cannot, in general, be used in a
target since they do not necessarily contain lexically adjacent statements. For
similar reasons, $* wild cards are seldom used in the specification of a
replacement.

§3.3.1 The syntax of a transformation

A transformation has the following form:

        ... pattern goes here ...
        ... replacement goes here ...
        ... conditions go here ...

The first section contains the ML program fragment which serves as the pattern,
the second section contains the replacement (also an ML program fragment), and
the final section contains a set of conditions (if no conditions are needed, the final
section may be omitted). Target statements within the pattern are indicated by a
double vertical bar to their left. For example:

        [transformation listing not recoverable]


In this transformation the first three statements of the pattern are the target and
will be replaced by the single statement replacement when the transformation is
applied; the remaining statements of the matched fragment (including any
intervening statements) will be unchanged. The intent of the transformation is to
use the short address form of the branch in place of the first
three statements if the ultimate destination is close enough (within
255 bytes). This transformation only handles forward jumps; another
transformation would be needed to accommodate jumps in the other direction.
Other points to note: the use of duplicate wild cards to specify that the same IL
component must appear in more than one place; the first and last statement of the
matched fragment must have location attributes.

With one exception, each component of the matched IL program fragment
must be subsumed by some component of the pattern. The contents of the
attribute field are exempt from this condition: attributes in the IL fragment that
are not mentioned in the pattern do not prevent a match. Using a
?* wild card to capture the unspecified attributes for later replication in the
replacement is not necessary as there are special rules concerning them (see
§3.3.2). In effect, unmentioned attributes are
transparent to a transformation; the information they contain is automatically copied
to the updated program wherever necessary. Attributes may be defined for any
statement or cell by simply including the appropriate assignment in the replacement.
In the example above, a location attribute is defined for the new branch statement
with the same value as the location attribute for the original "beq" statement.

A new rvalue for a cell can also be indicated to the metainterpreter by including
an assignment to the rvalue (similar to the definition of an attribute) in the attribute
field of the appropriate statement in the replacement. For example, the following
transformation replaces the addition of two constants with a store operation,
indicating that the destination of the store has acquired a new value which is the
sum of the constants.

        [transformation listing not recoverable]

The result from the call to add in the operand field of the replacement will be
automatically surrounded by quotes (to indicate that the new operand is a literal).
The number built-in function returns "true" if its argument is a numeric literal; the
condition could be omitted entirely as add aborts if its arguments are not numeric
literals, causing the transformation to fail. Note that the rules mentioned in the
previous paragraph will ensure that any attributes defined for ?dest in the original
statement will be added to the attribute field of the store statement. Finally, it is
worth pointing out that ?op1 and ?op2 do not need to be literals in the original
program; ?op1 and ?op2 need only be able to be resolved to literals when the
transformation is applied. For example, the statement "add <X> <Y>" would match
the pattern if <X> and <Y> were both known to have constant values. These
values would have been established in previous statements by including
assignments to <X> and <Y> in the attribute fields of those statements.
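
The effect of this transformation can be approximated in Python. The sketch is not the
ML transformation itself: the tuple layout of a statement, the known_rvalues table and
the dictionary used to record the new rvalue are all assumptions, and only integer
constants are handled.

```python
def fold_add(stmt, known_rvalues):
    """Rewrite ('add', dest, op1, op2) as a store of the sum when both operands
    resolve to numeric literals; return None when the rewrite does not apply."""
    op, dest, op1, op2 = stmt
    if op != "add":
        return None
    # An operand such as <X> is replaced by its known constant value, if any.
    resolved = [known_rvalues.get(x, x) for x in (op1, op2)]
    try:
        total = sum(int(x) for x in resolved)
    except ValueError:
        return None                      # plays the role of add[...] aborting
    literal = f'"{total}"'               # the result is quoted to mark it as a literal
    return ("store", dest, literal, {f"<{dest}>": literal})

known = {"<X>": "3", "<Y>": "4"}
folded = fold_add(("add", "Z", "<X>", "<Y>"), known)
```

The last element of the result stands in for the attribute-field assignment that tells
the metainterpreter the destination's new rvalue.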


§3.3.2 Constructing the replacement 

Two capabilities are provided by the replacement that have not been
discussed previously: the generation of new symbols unused elsewhere in the
program and the automatic handling of attributes. The ability to generate an
unused symbol is necessary when the transformation expands a single statement
into a series of new statements, as temporary cells used by the new statements
need to be supplied names that are not used elsewhere in the program. Automatic
handling of attributes enables the designer to ignore attributes with which he is not
directly concerned and guarantees that no attribute information will be lost through
an oversight in composing the transformation.

When expanding the specification of the replacement to arrive at the new
program fragment all wild cards must be eliminated. If the wild card has the same
form and name as one which appeared in the pattern, the IL component matched by
that wild card serves as its value in the replacement. For instance, applying the
last transformation in the previous section to [...]

If a ? wild card in the replacement does not correspond to some wild card in the
pattern (i.e., its name is different from any used in the pattern), a new value is
created to be used as its value. The new value is guaranteed to be different from
any used in the remainder of the IL program. Note that the designer must include
any attributes to be associated with the new value as part of the transformation.
If there are no wild cards in the pattern that correspond to $, ?*, and $* wild
cards in the replacement, the transformation is illegal and will never be applied.
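
The treatment of ? wild cards in the replacement can be sketched as follows. The naming
scheme for generated values (a prefix plus a counter) and both function names are
assumptions; the text only requires that a generated value differ from every value
already used in the program.

```python
import itertools

def fresh_names(used, prefix="t"):
    """Yield names that do not occur anywhere in the existing program."""
    for n in itertools.count(1):
        name = f"{prefix}{n}"
        if name not in used:
            yield name

def instantiate(replacement_wildcards, bindings, used):
    """Give each ? wild card its matched value, or a fresh name if the pattern never bound it."""
    gen = fresh_names(set(used))
    return {w: bindings[w] if w in bindings else next(gen) for w in replacement_wildcards}

values = instantiate(["?ptr", "?t1", "?t2"], {"?ptr": "p"}, used={"p", "t1", "x"})
```

Because a single generator supplies all fresh names for one application, two new wild
cards can never receive the same value.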


As an example of generated values consider the following transformation
concerned with the expansion of the subscript operator:

        [transformation listing not recoverable; the surviving fragments define the
        new cells <?t1>, <?t2> and <?t3> and refer to the lower bound and size of ?array]

The convert operator in the first line of the replacement will coerce the value of
the index to type "integer" (see §3.4 for a sample definition of convert). ?t1, ?t2,
and ?t3 are all new cells which will be named when this transformation is applied;
?ptr, ?array, and ?index will be taken from the subscript statement matched by the
pattern. Note that pertinent attributes for the new cells have been defined in the
transformation. The attribute defined in the last line of the replacement indicates
that the [...] pointed to by ?ptr [...] an
element in the array being subscripted.

The following rules are used in establishing attributes for statements in the
replacement:

    1.  Every attribute definition in the target statements will be copied to
        the attribute field of some statement in the replacement by the
        metainterpreter when it applies the transformation. Wherever possible,
        the statement chosen in the replacement will have the same
        label as the defining statement in the target; this does not make
        any difference as far as defining the attribute is concerned, but it
        improves the documentation value of the definition. If there are no
        [...]


§3.4 Example transformations

The first example is a transformation which expands the coercion operator
used in the sample expansion of subscript in the previous section. The convert
operator coerces its argument to have the type of the destination cell; it assumes
that types are constrained to be one of "integer" or "real".

        [transformation listing not recoverable]

[...] can be [...] separate transformations, one for each of the cases; [...] the amount
of optimization required to achieve the same result as above would be considerably
reduced.

[...] of the store operator. Unlike the expansion of convert, these expansions must
be done in separate transformations because of the use of the ALIAS operator†. The
first transformation handles the case where the store operation can be eliminated
completely because the destination is a newly defined temporary and the value
being stored is already contained in an accessible cell. In this case, all that needs
to be done is alias the temporary to the cell already containing the value
(effectively renaming all occurrences of the temporary to use the cell name).

† The ALIAS operator, like attributes, provides information which is independent of
the flow of control; branches cannot prevent "execution" of the alias operation.
Thus, [...]


The next two transformations translate the store instruction to the
appropriate machine instruction, depending on the type of the destination.

These latter transformations overlap the first: program fragments matched by
the first transformation will also be matched by one of the other two
transformations. It is up to the metainterpreter to decide which of the
applicable transformations to apply, taking into account the reduced cost of
the resulting code. The final transformation handles store statements whose
source and destination have different types.


§4.1 Example: a simple translator
... the initial high-level (IL) program. One possible outcome might be:

... integers (the same for both the source and machine language). In examining
the assembly language program, it is apparent that certain conventions have
been used in the translation: r6 is used as the local stack frame pointer,
external variables are referenced by name, local (automatic) storage for
blocks is allocated from the stack and referenced using the local stack frame
pointer, and so on. These conventions are established originally by the
designer and implemented by transformations in a straightforward fashion.

Although it is possible to interpretively apply the transformations and derive
a translation, the reader should be reminded that the main goal of the
transformations is to be descriptive. Many of the transformations below employ
attributes and conditions that represent a reasonable description of the
information and constraints involved in a transformation; these
transformations are not the most elegant expression of the necessary syntactic
transformation. In the final analysis, a transformation should be judged on
the information it conveys and not how close it comes to "the way it should
really be done."

The approach adopted for the organization of the transformations is as
follows: the initial IL program is first translated into a program for a stack
architecture, then the stack program is translated into target machine
instructions. Optimizations exist for each level of intermediate program:
sample high-level optimizations are described in §4.3, stack machine
optimizations in §4.1, and peephole machine optimizations in §4.2.

The first group of transformations describes the process of storage
allocation. An "offset" attribute is introduced for each automatic variable
declared in the block, giving the variable's offset from the base of the local
stack frame; the highest offset assigned is used in calculating the storage to
be allocated for the block when it is entered.
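
As a rough illustration of this allocation scheme, the following Python sketch
assigns offsets to automatic variables and computes the storage needed for the
block. The Decl record and the function assign_offsets are hypothetical names
introduced here; in the specification itself this work is expressed by
transformations and attributes, not by a procedure.

```python
from dataclasses import dataclass


@dataclass
class Decl:
    name: str
    storage_class: str  # "automatic", "external", or "temporary"
    size: int           # size in bytes


def assign_offsets(decls):
    """Give each automatic variable an offset from the base of the frame.

    Returns (offsets, frame_size), where frame_size is the next free offset,
    i.e. the amount of storage to allocate when the block is entered.
    """
    offsets = {}
    next_offset = 0
    for decl in decls:
        if decl.storage_class == "automatic":
            offsets[decl.name] = next_offset
            next_offset += decl.size
        # External variables and temporaries receive no frame offset here.
    return offsets, next_offset


decls = [
    Decl("A", "automatic", 2),
    Decl("B", "external", 2),
    Decl("C", "automatic", 2),
]
offsets, frame_size = assign_offsets(decls)
```

With two-byte integers A and C declared automatic and B external, A and C
receive offsets 0 and 2 and the block needs 4 bytes of stack storage.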


In this transformation, the "begin" statement is translated to instructions
that allocate a stack frame of the appropriate size; the size (?name:storage)
is known to be a constant but its value has not yet been determined. The last
statement in the replacement initializes the offset for later transformations;
its initial value is zero. The "comment" operator is ignored by the assembler
and will be used in the transformations as an operator in statements where
only the attribute fields are used. Comment statements could be eliminated
altogether and their associated attribute definitions placed in attribute
fields of other statements; they are used here to improve the readability of
the IL program.


... are assigned offsets; external variables are declared global. In the first
transformation, offsets are propagated with the aid of a comment statement that
gives the current offset. The $*stat wild card will match only statement
sequences that do not contain an "offset" attribute definition or
"declaration" operator in any statement (this restriction would be embodied in
the condition). Note that attributes defined for the declared variables will
be automatically copied over to some replacement statement (in these cases,
there is only one).

    <B>:size=2
    C:offset=2
    <C>:size=2
    T1:type=temporary    <T1>:type=integer

Figure 4.1: Sample program after declaration transformations


This transformation handles block exit after all declarations have been
processed, deallocating storage for the block and defining the storage size
attribute (?name:storage) for use during block entry. The condition is similar
to that for automatic variable declarations. Figure 4.1 shows the IL program
after these transformations have been applied.

The next two transformations translate "plus" and "assign" to stack
operations. The information in the label field is incorporated into the
operand field of the new instructions and the three-address "plus" operation
is expanded into a series of one-address stack operations. Since ... in this
case, ... generated.

The following two transformations perform simple optimizations on the stack
machine code generated so far. Both transformations improve on pop/push
instruction pairs that have identical operands: the first transformation
eliminates pairs whose arguments are temporaries; the second transformation
converts pairs whose arguments are variables to a copy from the top of the
stack. Since temporaries were generated by the compiler and do not represent
user-visible quantities, they may be eliminated during optimization. Figure
4.2 shows the example IL program after translation to stack instructions.
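
The expansion to stack operations and the two pop/push optimizations can be
sketched together in Python. The tuple encoding of statements, the function
names, and the sample program are assumptions made for illustration; the
sketch also assumes that a temporary is not referenced again after the
pop/push pair that is removed.

```python
def to_stack(stmt):
    """Expand one three-address statement into one-address stack operations."""
    if stmt[0] == "plus":                     # ("plus", dest, left, right)
        _, dest, left, right = stmt
        return [("push", left), ("push", right), ("add",), ("pop", dest)]
    if stmt[0] == "assign":                   # ("assign", dest, source)
        _, dest, source = stmt
        return [("push", source), ("pop", dest)]
    raise ValueError(f"no stack expansion for {stmt[0]!r}")


def optimize_pop_push(code, temporaries):
    """Improve on "pop X" immediately followed by "push X"."""
    out = []
    i = 0
    while i < len(code):
        cur = code[i]
        nxt = code[i + 1] if i + 1 < len(code) else None
        if cur[0] == "pop" and nxt == ("push", cur[1]):
            if cur[1] not in temporaries:
                # A user variable must still receive the value, so copy the
                # top of the stack into it without popping.
                out.append(("copy", cur[1]))
            # For a temporary the value simply stays on the stack.
            i += 2
        else:
            out.append(cur)
            i += 1
    return out


program = [("plus", "T1", "A", "B"), ("assign", "C", "T1")]
stack_code = [op for stmt in program for op in to_stack(stmt)]
optimized = optimize_pop_push(stack_code, temporaries={"T1"})
```

In this example the pair "pop T1, push T1" is removed entirely, leaving four
stack operations that compute A + B and pop the result into C.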


§4.2 Compiling past the machine interface
In this section, we deal with translating stack machine programs to target
machine programs. The first set of transformations are a translation of
"push", "pop", "copy", and "add" to PDP11 instructions. The size in bytes and
number of storage references required for each machine instruction are
indicated by the "size" and "refs" attributes. Initial values for the "size"
and "refs" attributes do not take operands into account; the operands'
contributions will be included when they are translated to legal machine
addresses.

The next group of transformations translates individual operands into the
appropriate machine addresses. Recall that r6 is used as the base of frame
pointer and that external operands are addressed by name.

    Label | Operator | Operands                  | Attributes
          | ?rator   | ?*before <?rand> ?*after  |

?*before and ?*after are ambiguous wild cards used to select any component in
the operand field that has the correct form (... by the remaining component in
the pattern's operand field). Note that the specification of "size" and "refs"
attributes in the patterns ensures that the transformations will only be
applied to machine instructions. Figure 4.3 shows the IL program after
application of these transformations (unused attributes have been eliminated
for brevity).
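
The operand translation can be pictured with a small Python sketch. The symbol
table layout, the function names, and in particular the per-operand cost
figures (extra_size, extra_refs) and the starting "size" and "refs" values are
illustrative assumptions, not numbers taken from the specification.

```python
def translate_operand(name, symbols, frame_reg="r6"):
    """Return a machine address for a declared variable.

    Automatic variables become an offset from the frame pointer; external
    variables are addressed by name. Anything else is returned unchanged.
    """
    info = symbols.get(name)
    if info is None:
        return name
    if info["type"] == "automatic":
        return f"{info['offset']}({frame_reg})"
    if info["type"] == "external":
        return name
    raise ValueError(f"no addressing rule for {name!r}")


def translate_instruction(instr, symbols, extra_size=2, extra_refs=1):
    """Translate operands and add their (assumed) cost to size and refs."""
    size, refs = instr["size"], instr["refs"]
    operands = []
    for operand in instr["operands"]:
        if operand in symbols:
            size += extra_size   # assumed cost of one translated operand
            refs += extra_refs
        operands.append(translate_operand(operand, symbols))
    return {"op": instr["op"], "operands": operands, "size": size, "refs": refs}


symbols = {
    "A": {"type": "automatic", "offset": 0},
    "B": {"type": "external"},
    "C": {"type": "automatic", "offset": 2},
}
push_a = translate_instruction(
    {"op": "mov", "operands": ["A", "-(sp)"], "size": 2, "refs": 1}, symbols)
```

Only operands that name declared variables are rewritten, which mirrors the
role of the wild cards in selecting one component of the operand field.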

The most obvious optimization opportunity involves a push onto the stack (a
"mov" instruction with a second argument of "-(sp)") followed by an
instruction that pops the stack to get its source operand (an instruction with
a first argument of "(sp)+"). Since an "add" can take the same "source"
operands as a "mov" instruction, the push/pop sequences can be reduced to a
single instruction.

Figure 4.4 shows the effect of this single optimization.
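
A minimal Python sketch of this push/pop reduction follows. Instructions are
assumed to be (operator, source, destination) tuples, and the restriction to
"mov" and "add" and to pushed operands that do not mention sp are conservative
assumptions of the sketch rather than conditions quoted from the
transformation.

```python
def merge_push_pop(code):
    """Fuse "mov X,-(sp)" followed by "op (sp)+,Y" into "op X,Y"."""
    out = []
    i = 0
    while i < len(code):
        cur = code[i]
        nxt = code[i + 1] if i + 1 < len(code) else None
        if (nxt is not None
                and cur[0] == "mov" and cur[2] == "-(sp)"
                and "sp" not in cur[1]            # pushed value must not depend on sp
                and nxt[0] in ("mov", "add") and nxt[1] == "(sp)+"):
            out.append((nxt[0], cur[1], nxt[2]))
            i += 2
        else:
            out.append(cur)
            i += 1
    return out


code = [
    ("mov", "0(r6)", "-(sp)"),   # push A
    ("mov", "B", "-(sp)"),       # push B
    ("add", "(sp)+", "(sp)"),    # add
    ("mov", "(sp)+", "2(r6)"),   # pop C
]
merged = merge_push_pop(code)
```

Here the push of B and the following add collapse into a single "add B,(sp)";
the first push and the final pop are not adjacent to a matching partner and
are left alone.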
Many other machine level optimizations are possible at this point; several 
optimizing transformations are listed below. These include removing superfluous 


zeroes in index expressions, eliminating additions with a zero operand, and 


Chapter Four — Compiling past the machine Interface 71. 


Figure 4.4: Sample program after push/pop optimization




Figure 4.6 shows the IL program after ...; comments and attributes have been
omitted ...

Obviously, additional transformations ... opportunities that arise from the
translation ...; however, the bulk of ...


§4.3 Interacting with the metainterpreter 
The transformations in the previous section dealt with the translation of the
input program to a target machine program with little attention to the
semantics of the initial IL program. For the most part, the metainterpreter
had only to choose which transformations to apply; this task was made fairly
simple for, in almost every case, if the transformation's pattern and
conditions were met, it was appropriate to apply the transformation. This
section explores how the capabilities of the metainterpreter can be called
into play to improve the quality of the resulting translation.
The first example exploits the metainterpreter's ability to perform certain
computations at compile time. Consider the addition of the following
transformations to the catalogue:




... "assign" and "plus" operators. Using the definition of "add" given in
§3.2.2, the second transformation will only succeed if ?op1 and ?op2 are
numeric literals. By extending the metainterpreter to support symbolic
computation, both the conditions above would be met even for general operands
(thus the add would succeed at compile time). The primary benefit of such an
extension would be a corresponding extension in the metainterpreter's ability
to detect ...

By applying these transformations to the sample program in the first section,
the metainterpreter can derive the following rvalue information:

    <A>="1" ...

As a result of this new information, the initial program can be modified as
shown in Figure 4.6 (update of Figure 4.2). By adding a transformation to
eliminate assigns to subsequently unused temporaries, the transformations of
§4.2 can produce a program identical to the assembly language program given in
§4.1 (see Figure 4.7).

                           offset=0
    A:type=automatic       A:offset=0
    <A>:type=integer       <A>:size=2
                           offset=2
    B:type=external
                           <B>:size=2
    C:type=automatic       C:offset=2
    <C>:type=integer       <C>:size=2
                           offset=4
    T1:type=temporary      <T1>:type=integer
    PROG:storage ...

Figure 4.7: Sample program after optimizations of §4.3
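
The kind of compile-time computation discussed in this section can be
illustrated with a short Python sketch that propagates known rvalues through
straight-line code and folds "plus" when both operands are numeric literals.
The statement encoding, the function name propagate, and the sample program
are assumptions of the sketch; it ignores flow of control and aliasing, which
the metainterpreter would have to take into account.

```python
def propagate(program):
    """Replace operands by known literal values and fold constant additions.

    Statements are ("assign", dest, source) or ("plus", dest, left, right);
    integer operands stand for numeric literals.
    Returns the rewritten program and the table of known rvalues.
    """
    known = {}
    out = []
    for stmt in program:
        if stmt[0] == "assign":
            _, dest, source = stmt
            value = known.get(source, source)
            out.append(("assign", dest, value))
        elif stmt[0] == "plus":
            _, dest, left, right = stmt
            left = known.get(left, left)
            right = known.get(right, right)
            if isinstance(left, int) and isinstance(right, int):
                value = left + right           # addition succeeds at compile time
                out.append(("assign", dest, value))
            else:
                value = None                   # result unknown until run time
                out.append(("plus", dest, left, right))
        else:
            raise ValueError(f"unknown operator {stmt[0]!r}")
        if isinstance(value, int):
            known[dest] = value
        else:
            known.pop(dest, None)              # any earlier value is now stale
    return out, known


program = [
    ("assign", "A", 1),
    ("plus", "T1", "A", 2),
    ("assign", "C", "T1"),
    ("plus", "D", "C", "B"),
]
folded, rvalues = propagate(program)
```

After propagation the first addition has become an assignment of a literal,
and the temporary T1 is no longer read by any later statement, so a
transformation that removes assignments to unused temporaries could delete it.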


... the class of languages and machines that can be accommodated.


Chapter 2 describes a general purpose intermediate language based on a ...
control and management of names and values. ... accessible to the
metainterpreter without a detailed analysis of the actual operation performed
by each statement. Information about the flow of control and the effect of
each statement on the values of variables can be easily determined from the
label field of that statement. In addition, attributes provide a general ...
statement. Attributes can be used to supply a symbol table facility for
variables ... allows them to be referenced by the transformations, permitting
the translation of statements to be tailored in response to special properties
of the operands or opportunities presented by the context.

In Chapter 3, the transformation catalogue is discussed and the metalanguage
in which the individual transformations are written is presented. The ...
leaving statements and components unspecified through the use of wild cards.
Each transformation contains two ML program fragments (templates): a pattern
that, along with a set of conditions, specifies the IL program fragments to
which the transformation can be applied, and a replacement that tells how to
construct an updated IL program. Built-in functions that allow access to some
of the metainterpreter's knowledge about IL programs and perform some simple
computations on literals are provided; these functions are used in
constructing the replacement and conditions. The conditions associated with a
transformation specify contextual constraints that are not related to the
syntactic form of the matched fragment. The wide range of information
available to a transformation enables the semantics of code generation to be
expressed as step-by-step syntactic transformations of the intermediate
language program.
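
The three parts of a transformation (pattern, conditions, replacement) can be
made concrete with a small Python sketch. It is a deliberate simplification:
statements are flat tuples, a wild card is any component beginning with "?",
and only single-statement patterns are supported. The class and function names
are hypothetical and do not reproduce the ML syntax.

```python
from dataclasses import dataclass, field


@dataclass
class Transformation:
    pattern: tuple                      # template with "?name" wild cards
    replacement: list                   # templates built from the same wild cards
    conditions: list = field(default_factory=list)  # predicates over bindings


def match(pattern, stmt):
    """Return wild card bindings if stmt has the form of pattern, else None."""
    if len(pattern) != len(stmt):
        return None
    bindings = {}
    for p, s in zip(pattern, stmt):
        if isinstance(p, str) and p.startswith("?"):
            if bindings.setdefault(p, s) != s:   # repeated wild card must agree
                return None
        elif p != s:
            return None
    return bindings


def instantiate(template, bindings):
    return tuple(bindings.get(part, part) for part in template)


def apply_once(transformation, program):
    """Apply the transformation at the first statement it fits, or return None."""
    for i, stmt in enumerate(program):
        bindings = match(transformation.pattern, stmt)
        if bindings is not None and all(c(bindings) for c in transformation.conditions):
            new = [instantiate(t, bindings) for t in transformation.replacement]
            return program[:i] + new + program[i + 1:]
    return None


expand_plus = Transformation(
    pattern=("plus", "?dest", "?left", "?right"),
    replacement=[("push", "?left"), ("push", "?right"), ("add",), ("pop", "?dest")],
)
drop_self_assign = Transformation(
    pattern=("assign", "?dest", "?source"),
    conditions=[lambda b: b["?dest"] == b["?source"]],
    replacement=[],
)

program = [("assign", "A", "A"), ("plus", "T1", "A", "B")]
step1 = apply_once(drop_self_assign, program)
step2 = apply_once(expand_plus, step1)
```

The condition on drop_self_assign is a contextual constraint in the sense of
the text: it restricts applicability without changing the syntactic form that
the pattern describes.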

Chapter 4 presents a set of example transformations as a specification for
translating a rudimentary source language to PDP11-like assembly language. As
suggested in §1.3.1, the transformations are organized about the use of an
abstract machine (in this case, with a stack architecture). The initial
translation to stack machine instructions allows several optimizations to be
accomplished that would have otherwise been difficult (e.g., the removal of
unnecessary temporaries inserted by the first phase of the compiler). Several
transformations that allow the metainterpreter to infer the run time values of
the variables and subsequently ...

§6.2 An overview of the metainterpreter
Throughout earlier portions of this thesis, the metainterpreter has been ...
The responsibilities of the metainterpreter fall in two main areas:
bookkeeping and ...

  • translation of attribute references to their corresponding values
    wherever possible. If any unresolved attribute references remain
    after completion of the translation, the metainterpreter should abort,
    indicating an inconsistent IL program.

  • evaluation of built-in functions. If a function application aborts
    (e.g., because of domain errors), it is saved for reevaluation later in
    the ...

  • propagation of rvalue information in combination with data from flow
    analysis to replace rvalue operands with literals representing the
    known value of the rvalue.

  • application of a chosen transformation. Information obtained during
    ... specification (along with any generated symbols) to create a ... of
    the target ... in the pattern. During the ... of the replacement, many
    of the other bookkeeping ... then and there, eliminating the need for
    extra passes over the IL program.
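
The first bookkeeping task, resolving attribute references and aborting if any
remain, can be sketched as follows. Statements are plain strings here, and the
name:attribute spelling of a reference, the sample instructions, and the
function names are assumptions made for the sake of a runnable example.

```python
import re

# A reference such as "A:offset" or "PROG:storage" (assumed spelling).
ATTR_REF = re.compile(r"([A-Za-z_]\w*):([A-Za-z_]\w*)")


def resolve(text, attributes):
    """Replace every attribute reference whose value is already known."""
    def substitute(m):
        key = (m.group(1), m.group(2))
        return str(attributes[key]) if key in attributes else m.group(0)
    return ATTR_REF.sub(substitute, text)


def resolve_program(program, attributes):
    return [resolve(stmt, attributes) for stmt in program]


def check_consistent(program):
    """Abort if any attribute reference is still unresolved."""
    leftover = [stmt for stmt in program if ATTR_REF.search(stmt)]
    if leftover:
        raise ValueError(f"inconsistent IL program: {leftover}")


program = ["sub #PROG:storage,sp", "mov A:offset(r6),-(sp)"]
attributes = {("A", "offset"): 0}
partial = resolve_program(program, attributes)     # storage size not yet known

attributes[("PROG", "storage")] = 4                # defined later, at block exit
final = resolve_program(partial, attributes)
check_consistent(final)
```

The reference to the storage size survives the first pass because its value is
only defined later; this is why resolution is attempted "wherever possible"
and the consistency check is deferred to the end.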


Two other tasks fall in this area: checking for termination conditions and
choosing which transformation to apply next.

§1.3.2 outlines how to tell when the translation is complete: a measure of the
program's optimality is computed using a formula (involving the values of
attributes associated with every statement) supplied by the user. If the
calculation aborts because some statement does not have the appropriate
attributes, the application of more transformations is called for; if no more
transformations are applicable, backtracking is called for. If the measure can
be computed, it is used to remember the best translation found to date and the
metainterpreter backtracks to find other translations. Backtracking involves
undoing the last successful transformation and applying some other
transformation (repeating for another level if all the applicable
transformations have been applied at this level). Exhaustive search of the
transformation space can be avoided if the user supplies a "trigger" value for
the measure: a program whose measure is less than the trigger value is
considered an acceptable translation and becomes the final output. Often the
transformations are constructed in such a way that the ... possible when
rvalue information is ...
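
The search strategy just described can be sketched as a recursive backtracking
procedure. In this Python sketch a transformation is modelled as a function
from a program to a list of successor programs, and a measure that "aborts" is
modelled by returning None; both conventions, the toy cost table, and all
names are assumptions made for illustration.

```python
def search(program, transformations, measure, trigger=None):
    """Return (cost, program) for the best translation found, or None."""
    best = None

    def visit(current):
        nonlocal best
        cost = measure(current)
        if cost is not None:
            # Translation complete: remember it if it is the best so far.
            if best is None or cost < best[0]:
                best = (cost, current)
            # Stop the whole search only if the trigger is satisfied.
            return trigger is not None and cost < trigger
        for transform in transformations:
            for successor in transform(current):
                if visit(successor):
                    return True
        return False          # nothing (more) applies here: backtrack

    visit(program)
    return best


COST = {"add #1": 4, "inc": 2}        # toy per-instruction costs


def measure(program):
    try:
        return sum(COST[stmt] for stmt in program)
    except KeyError:                  # some statement lacks a cost attribute
        return None


def rewrite_first(old, new):
    def transform(program):
        if old in program:
            i = program.index(old)
            return [program[:i] + (new,) + program[i + 1:]]
        return []
    return transform


catalogue = [rewrite_first("plus1", "add #1"), rewrite_first("plus1", "inc")]
exhaustive = search(("plus1", "plus1"), catalogue, measure)
triggered = search(("plus1", "plus1"), catalogue, measure, trigger=7)
```

Without a trigger every translation is examined and the cheapest one is
returned; with a trigger of 7 the search stops at the first translation whose
measure is below that value, which need not be the cheapest.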

The required flow analysis could be done anew at the completion of every
transformation application but this would be incredibly inefficient and
prohibitive for large programs. The bit vector methods outlined in [Schatz]
and [Ullman] offer an efficient representation of the data flow information
that can be incrementally updated as long as the underlying flow graph is not
changed (except to add/delete more straight-line code or loops completely
contained in the added code). Thus, the more time-consuming iterative
calculation required when the flow graph is not known need only be performed
when a transformation affects the branches and joins of the graph. A large
percentage of transformations do not affect the graph itself; all of the
transformations in Chapter 3 could be accommodated by incremental updates.
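
To give a feel for a bit vector representation of data flow information, the
sketch below solves a textbook reaching-definitions problem using Python
integers as bit vectors. It is a generic iterative formulation, not the
specific method of the cited works, and the block names, definitions, and the
function solve_reaching are assumptions of the example. Adding straight-line
code to a block changes only that block's gen and kill vectors, so the
predecessor map describing the flow graph is reused unchanged.

```python
def solve_reaching(preds, gen, kill):
    """Iteratively compute the set of definitions reaching the end of each block.

    preds maps a block to its predecessor blocks; gen and kill map a block to
    integers used as bit vectors (bit i stands for definition i).
    """
    out = {block: gen[block] for block in gen}
    changed = True
    while changed:
        changed = False
        for block in gen:
            reaching_in = 0
            for p in preds.get(block, ()):
                reaching_in |= out[p]
            new_out = gen[block] | (reaching_in & ~kill[block])
            if new_out != out[block]:
                out[block] = new_out
                changed = True
    return out


# Flow graph: B1 -> B2, B1 -> B3, B2 -> B3.
preds = {"B2": ["B1"], "B3": ["B1", "B2"]}
# Bit 0: x defined in B1; bit 1: x redefined in B2; bit 2: y defined in B3.
gen = {"B1": 0b001, "B2": 0b010, "B3": 0b100}
kill = {"B1": 0b010, "B2": 0b001, "B3": 0b000}
before = solve_reaching(preds, gen, kill)

# A transformation adds a new definition (bit 3) to the straight-line code
# of B2; the graph itself, and therefore preds, is untouched.
gen["B2"] |= 0b1000
after = solve_reaching(preds, gen, kill)
```

The second call simply re-solves over the same graph; a genuinely incremental
update would propagate only the changed bit, but the representation is the
same.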

In a different vein, code motion out of loops, elimination of induction
variables, etc. (see [Aho77b] for a large sample) represent other
optimizations that could be incorporated in the metainterpreter. As algorithms
are developed for register allocation and optimal ordering of expression
execution, these will also be prime candidates for inclusion. Our shopping
list can easily grow much faster than our ability to implement the algorithms
effectively within the framework provided by the metainterpreter. Fortunately,
some transformations are much more important ... code generator.


§6.3 Directions for future research
Two avenues of research are immediate extensions of the work reported here.
The examples of Chapters 3 and 4 indicate that much improvement could be made
to the usability of the metalanguage. Many operations commonly performed
during ... could lead to a very competent compiler that is easily maintained
and modified to produce code for different target machines.

Many other implementation approaches lie further off the beaten path. One of
the most interesting is the prospect of creating a "compiled" code generator
based on an analysis of the specification. Such compilation would require
extensive information on the interaction between components of the
specification; the metacompiler would have to "understand" the effect of each
transformation in a much more fundamental way than is needed from an
interpretive approach. Compiling the specification would eliminate much of the
searching and backtracking described in the beginning of §6.2 with the result
of a vast improvement in the performance of the code generator. The
metacompilation phase will almost certainly be necessary if the performance of
our code generator is to approach that of conventional ad hoc code generators.

Metacompilation is closely related to current work in the field of automatic
program synthesis. The specification used by the IL/ML system has many of the
same characteristics as descriptions used in program synthesis systems
[Green]: a pattern-based transformation system is used as the knowledge base
by both systems. This commonality promises to allow many of the same
techniques to be used in the analysis of the specification. This area of
research is still virgin territory with the same promises of success and
failure shared by any frontier.